(57) Abstract: Methods for manipulating carbohydrate processing pathways in cells of interest are provided. Methods are directed
encompass the implementation of new carbohydrate bioassays, the examination of a selection of insect cell lines and the use of
bioinformatics to identify gene sequences for critical processing enzymes. The compositions comprise cells of interest producing
sialylated glycoproteins. The methods and compositions are useful for heterologous expression of glycoproteins.

ENGINEERING INTRACELLULAR SIALYLATION PATHWAYS

FIELD OF THE INVENTION 

The invention relates to methods and compositions for expressing sialylated
glycoproteins in heterologous expression systems, particularly insect cells.

BACKGROUND OF THE INVENTION 

While heterologous proteins are generally identical to their native counterparts at the
amino acid level, their post-translationally attached carbohydrate moieties often differ
from the carbohydrate moieties found on proteins expressed in their natural host species.
Thus, carbohydrate processing is specific and limiting in a wide variety of organisms
including insect, yeast, mammalian, and plant cells.

The baculovirus expression vector has promoted the use of insect cells as hosts
for the production of heterologous proteins (Luckow et al. (1993) Curr. Opin.
Biotech. 4:564-572, Luckow et al. (1995) Protein production and processing from
baculovirus expression vectors). Commercially available cassettes allow rapid
generation of recombinant baculovirus vectors containing foreign genes under the
control of the strong polyhedrin promoter. This expression system is often used to
produce heterologous secreted and membrane-bound glycoproteins normally of
mammalian origin.

However, post-translational processing events in the secretory apparatus of
insect cells yield glycoproteins with covalently-linked oligosaccharide attachments
that differ significantly from those produced by mammalian cells. While mammalian
cells often generate complex oligosaccharides terminating in sialic acid (SA), insect
cells typically produce truncated (paucimannosidic) and hybrid structures terminating
in mannose (Man) or N-acetylglucosamine (GlcNAc) (Figure 1). The inability of
insect cell lines to generate complex carbohydrates comprising sialic acid
significantly limits the wider application of this expression system.

The carbohydrate composition of an attached oligosaccharide, especially sialic
acid, can affect a glycoprotein's solubility, structural stability, resistance to protease
degradation, biological activity, and in vivo circulation (Goochee et al. (1991)
Bio/technology 9:1347-1355, Cumming et al. (1991) Glycobiology 1:115-130,
Opdenakker et al. (1993) FASEB J. 7:1330, Rademacher et al. (1988) Ann. Rev.
Biochem., Lis et al. (1993) Eur. J. Biochem. 218:1-27). The terminal residues of a
carbohydrate are particularly important for therapeutic proteins since the final sugar
moiety often controls the protein's in vivo circulatory half-life (Cumming et al. (1991)
Glycobiology 1:115-130). Glycoproteins with oligosaccharides terminating in sialic
acid typically remain in circulation longer because receptors in hepatocytes and
macrophages bind and rapidly remove structures terminating in
mannose (Man), N-acetylglucosamine (GlcNAc), and galactose (Gal) from the
bloodstream (Ashwell et al. (1974) Biochem. Soc. Symp. 40:117-124, Goochee et al.
(1991) Bio/technology 9:1347-1355, Opdenakker et al. (1993) FASEB J. 7:1330).
Unfortunately, Man and GlcNAc are the residues most commonly found on the
termini of glycoproteins produced by insect cells. The presence of sialic acid can also
be important to the structure and function of a glycoprotein since sialic acid is one of
the few sugars that is charged at physiological pH. The sialic acid residue is often
involved in biological recognition events such as protein targeting, viral infection, cell
adhesion, tissue targeting, and tissue organization (Brandley et al. (1986) J.
Leukocyte Biol. 40:97-111, Varki et al. (1997) FASEB J. 11:248-255, Goochee et al.
(1991) Bio/technology 9:1347-1355, Lopez et al. (1997) Glycobiology 7:635-651,
Opdenakker et al. (1993) FASEB J. 7:1330).

The composition of the attached oligosaccharide for a secreted or membrane-bound
glycoprotein is dictated by the structure of the protein and by the
post-translational processing events that occur in the endoplasmic reticulum and Golgi
apparatus of the host cell. Since the secretory processing machinery in mammalian
cells differs from that in insect cells, glycoproteins with very different carbohydrate
structures are produced by these two host cells (Jarvis et al. (1995) Virology 212:500-511,
Maru et al. (1996) J. Biol. Chem. 271:16294-16299, Altmann et al. (1996)
Trends in Glycoscience and Glycotechnology 8:101-114). These differences in
carbohydrate structure can have dramatic effects on the in vitro and in vivo properties
of the resulting glycoprotein. For example, the in vitro activity of human thyrotropin
(hTSH) expressed in insect cells was five times higher than the activity of the same
glycoprotein produced from mammalian Chinese hamster ovary (CHO) cells
(Grossman et al. (1997) Endocrinology 138:92-100). However, the in vivo activity of
the insect cell-derived product was substantially lower due to its rapid clearance from
injected rats. The drop in in vivo hTSH activity was linked to the absence of
complex-type oligosaccharides terminating in sialic acid in the insect cell product
(Grossman et al. (1997) Endocrinology 138:92-100).

N-glycosylation is highly significant to glycoprotein structure and function. In
insect and mammalian cells, N-glycosylation begins in the endoplasmic reticulum
(ER) with the addition of the oligosaccharide Glc3Man9GlcNAc2 onto the asparagine
(Asn) residue in the consensus sequence Asn-X-Ser/Thr (Moremen et al. (1994)
Glycobiology 4:113-125, Varki et al. (1993) Glycobiology 3(2):97-130, Altmann et al.
(1996) Trends in Glycoscience and Glycotechnology 8:101-114). As the glycoprotein
passes through the ER and Golgi apparatus, enzymes trim and add different sugars to
this N-linked glycan. These carbohydrate modification steps can differ in mammalian
and insect hosts.
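
By way of illustration only, the Asn-X-Ser/Thr consensus sequence described above can be located in a protein sequence with a short script. The following is a minimal sketch, not a method of the invention. The function name find_sequons and the example sequence are hypothetical, and the exclusion of proline at position X is a widely used convention that is assumed here rather than stated in the text above.

```python
import re

# Asn-X-Ser/Thr sequon in one-letter code: N, any residue, then S or T.
# Assumption: proline is excluded at X (common convention, not from the text).
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical sequence: sequons at Asn-2 and Asn-10; "NPT" is skipped.
    print(find_sequons("MNGSANPTQNLT"))
```

A match indicates only a potential N-glycosylation site; whether the site is occupied, and with which glycan, depends on the host cell processing steps discussed in this section.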

In mammalian cell lines, the initial trimming steps are followed by the
enzyme-catalyzed addition of sugars including N-acetylglucosamine (GlcNAc),
galactose (Gal), and sialic acid (SA) by the steps shown in Figure 2, and as described
in Goochee et al. (1991) Bio/technology 9:1347-1355.

In insect cells, N-linked glycans attached to heterologous and homologous
glycoproteins comprise either high-mannose (Man9-5GlcNAc2) or truncated
(paucimannosidic) (Man3-2GlcNAc2) oligosaccharides, occasionally comprising
alpha(1,6)-fucose (Figure 3; Jarvis et al. (1989) Mol. Cell. Biol. 9:214-223, Kuroda
et al. (1990) Virology 174:418-429, Marz et al. (1995) Glycoproteins 543-563,
Altmann et al. (1996) Trends in Glycoscience and Glycotechnology 8:101-114).
These reports, primarily directed to Sf-9 or Sf-21 cells from Spodoptera frugiperda,
indicated that insect cells could trim N-linked oligosaccharides but could not elongate
these trimmed structures to produce complex carbohydrates. Reports from other
insect cell lines, including Trichoplusia ni (T. ni; High Five™) and Estigmene acrea
(Ea-4), indicated the presence of limited levels of partially elongated hybrid
(structures with one terminal Man branch and one branch with terminal Gal, GlcNAc,
or another sugar; Figure 4a) and complex (structures with two non-Man termini;
Figure 4b) N-linked oligosaccharides (Oganah et al. (1996) Bio/Technology 14:197-202,
Hsu et al. (1997) J. Biol. Chem. 272:9062-9070). Low levels of GlcNAc
transferase I and II (GlcNAc TI and TII), fucosyltransferase, mannosidases I and II,
and Gal transferase (Gal T) have been reported in these insect cells, indicating a
limited capability for production of these hybrid and complex N-linked
oligosaccharides in these cells (Velardo et al. (1993) J. Biol. Chem. 268:17902-17907,
Altmann et al. (1996) Trends in Glycoscience and Glycotechnology 8:101-114, van
Die et al. (1996) Glycobiology 6:157-164).

However, most insect cell-derived glycoproteins lack complex N-glycans.
This absence may be attributed to the presence of the hexosaminidase
N-acetylglucosaminidase, which cleaves GlcNAc attached to the alpha(1,3) Man branch to
generate paucimannosidic oligosaccharides (Licari et al. (1993) Biotech. Prog. 9:146-152,
Altmann et al. (1995) J. Biol. Chem. 270:17344-17349). Chemicals have been
added in an attempt to inhibit this glycosidase activity, but significant levels of
paucimannosidic structures remain even in the presence of these inhibitors (Wagner et
al. (1996) J. Virology 70:4103-4109).

Manipulating carbohydrate processing in insect cells has been attempted, and
in mammalian cells the expression of sialyltransferases, galactosyltransferases and
other enzymes is well established as a means of enhancing the level of oligosaccharide
attachment (see U.S. Patent No. 5,047,335). However, in these cases, the presence of
the necessary donor nucleotide substrates, most significantly the sialylation
nucleotide, CMP-sialic acid, in the proper subcellular compartment has been assumed.
Attempts to manipulate carbohydrate processing have been made by expressing single
transferases such as N-acetylglucosamine transferase I (GlcNAc TI), galactose
transferase (Gal T), or sialyltransferase (Lee et al. (1989) J. Biol. Chem. 264:13848-13855,
Wagner et al. (1996) Glycobiology 6:165-175, Jarvis et al. (1996) Nature
Biotech. 14:1288-1292, Hollister et al. (1998) Glycobiology 5:473-480, Smith et al.
(1990) J. Biol. Chem. 265:6225-6234, Grabenhorst et al. (1995) Eur. J. Biochem.
232:718-725). Introduction of a mammalian beta(1,4)-GalT using viral vectors
(Jarvis et al. (1995) Virology 212:500-511) or stably-transformed cell lines (Hollister
et al. (1998) Glycobiology 5:473-480) indicates that both approaches can enhance the
extent of complex glycosylation of foreign glycoproteins expressed in insect cells.
GlcNAc TI co-expression can increase the number of recombinant glycoproteins with
oligosaccharides containing GlcNAc on the Man alpha(1,3) branch (Jarvis et al.
(1996) Nature Biotech. 14:1288-1292, Jarvis et al. (1995) Virology 212:500-511,
Hollister et al. (1998) Glycobiology 5:473-480, Wagner et al. (1996) Glycobiology
6:165-175).

However, the production of complex carbohydrates comprising sialic acid has
not been observed in these studies. Sialylation of a single recombinant protein
(plasminogen) produced in baculovirus-infected insect cells has been reported
(Davidson et al. (1990) Biochemistry 29:5584-5590), but findings appear to be
specific to this glycoprotein. Conversely, many reports indicate the complete absence
of any attached sialic acid on glycoproteins from all insect cell lines tested to date
(Voss et al. (1993) Eur. J. Biochem. 217:913-919, Jarvis et al. (1995) Virology
212:500-511, Marz et al. (1995) Glycoproteins 543-563, Altmann et al. (1996) Trends
in Glycoscience and Glycotechnology 8:101-114, Hsu et al. (1997) J. Biol. Chem.
272:9062-9070).

The reason for this absence of sialylated glycoproteins was initially puzzling
since polysialic acid structures were obtained in Drosophila embryos (Roth et al.
(1992) Science 255:673-675). However, as demonstrated herein, it is now evident
that insect cell lines generate very little sialic acid as compared to mammalian CHO
cells (see Figure 16). With very little sialic acid, the insect cells cannot generate the
donor nucleotide CMP-sialic acid essential for sialylation. A similar lack or
limitation in donor nucleotide substrates may be observed in other eukaryotes as well.
Thus, the co-expression of sialyltransferase and other transferases must be
accompanied by the intracellular generation of the proper donor nucleotide substrates
and the proper acceptor substrates in order to produce sialylated and other
complex glycoproteins in eukaryotes. In addition, cells are not permeable to sialic acid
or CMP-sialic acid, so these substrates cannot be supplied simply by adding them to the
culture medium (Bennett et al. (1981) J. Cell. Biol. 55:1-15).

The manipulation of post-translational processing is particularly relevant to
biotechnology since recombinant DNA products generated in different hosts are
usually identical at the amino acid level and differ only in the attached carbohydrate
composition (Goochee et al. (1991) Bio/technology 9:1347-1355). Engineering
carbohydrate pathways is useful to make recombinant DNA technology more versatile
and expand the number of hosts that can generate particular glycoforms. This
flexibility could ultimately lower biotechnology production costs since host efficiency
would be the primary factor dictating which expression system is chosen rather than a
host's capacity to produce a specific glycoform. Furthermore, carbohydrate
engineering is useful to tailor a glycoprotein to include specific oligosaccharides that
could alter biological activity, structural properties or circulatory targets. Such
carbohydrate engineering efforts will provide a greater variety of recombinant
glycoproducts to the biotechnology industry.

Glycoproteins containing sialylated oligosaccharides would have improved in
vivo circulatory half-lives that could lead to their increased utilization as vaccines and
therapeutics. In particular, complex sialylated glycoproteins from insect cells would
be more appropriate biological mimics of native mammalian glycoproteins in
molecular recognition events in which sialic acid plays a role.

Therefore, manipulating carbohydrate processing pathways in insect and other
eukaryotic cells so that the cells produce complex sialylated glycoproteins is useful
for enhancing the value of heterologous expression systems and increasing the
application of heterologous cell expression products as vaccines, therapeutics, and
diagnostic tools; for increasing the variety of glycosylated products to be generated in
heterologous hosts; and for lowering biotechnology production costs, since particular
expression systems can be selected based on efficiency of production rather than the
capacity to produce particular product glycoforms.

SUMMARY OF THE INVENTION 

Compositions and methods for producing glycoproteins having sialylated
oligosaccharides are provided. The compositions of the invention comprise enzymes
involved in carbohydrate processing and production of nucleotide sugars, nucleotide
sequences encoding such enzymes, and cells transformed with these nucleotide
sequences. The compositions of the invention are useful in methods for producing
complex sialylated glycoproteins in cells of interest including, but not limited to,
mammalian cells and non-mammalian cells (e.g., insect cells).

The sialylation process involves the post-translational addition of a donor
substrate, cytidine monophosphate-sialic acid (CMP-SA), onto a specific acceptor
carbohydrate (GalGlcNAcMan-R) via an enzymatic reaction catalyzed by a
sialyltransferase in the Golgi apparatus. Since one or more of these three reaction
components (i.e., acceptor, donor substrate, and the enzyme sialyltransferase) is
limiting or absent in certain cells of interest, methods are provided to enhance the
production of the limiting components. Polynucleotide sequences encoding the
enzymes used according to the methods of the invention are known or novel bacterial,
invertebrate, fungal, or mammalian sequences and/or fragments or variants thereof,
that are optionally identified using bioinformatics searches. According to one
embodiment of the invention, completion of the sialylation reaction is achieved by
expressing a sialyltransferase enzyme, or a fragment or variant thereof, in the
presence of acceptor and/or donor substrates. The invention also provides an assay
for sialylation, wherein the structures and compositions of N-linked oligosaccharides
attached to a model secreted glycoprotein (e.g., transferrin) are elucidated using
multidimensional chromatography.

Cells of interest that have been recombinantly engineered to produce new
forms of sialylated glycoproteins, higher concentrations of sialylated glycoproteins,
and/or elevated concentrations of donor substrates (e.g., nucleotide sugars) required
for sialylation, as well as kits for expression of sialylated glycoproteins, are also
provided.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the typical differences in insect and mammalian carbohydrate 
structures. 

Figure 2 depicts the enzymatic generation of a complex sialylated
carbohydrate in mammalian cells.

Figure 3 depicts a paucimannosidic oligosaccharide.

Figure 4a depicts a hybrid glycan from Estigmene acrea (Ea-4) insect cells.

Figure 4b depicts a complex glycan from Estigmene acrea (Ea-4) insect cells.

Figure 5 depicts the nucleotide sugar production pathways in mammalian and 
E. coli cells leading to sialylation. 

Figure 6 depicts a chromatogram of labeled oligosaccharides separated by
reverse phase High Performance Liquid Chromatography (HPLC) on an ODS-silica
column. Using this technique, oligosaccharides are fractionated according to their
carbohydrate structures. Panel "L" represents cell lysate fractions and panel "S"
represents cell supernatant fractions.

Figure 7 depicts the structure of Oligosaccharide G. 

Figure 8 depicts the glycosylation pathway in Trichoplusia ni insect cells
(High Five™ cells; Invitrogen Corp., Carlsbad, CA, USA).

Figure 9 depicts the chromatogram of a Galactose-transferase assay following 
High Performance Anion Exchange Chromatography (HPAEC), as described in the 
Examples and references cited therein. 

Figure 10 depicts the chromatogram of a 2,3-Sialyltransferase assay following 
Reverse Phase-High Performance Liquid Chromatography (RP-HPLC), as described 
in the Examples. 

Figure 11 depicts the results of a Galactose-transferase (Gal-T) assay of insect
cell lysates performed using a Europium (Eu+3)-labeled Ricinus communis lectin
(RCA 120) probe, which specifically binds Gal or GalNAc oligosaccharide structures,
as described in the Examples. Each column represents the Gal-T activity in a given
sample: Column (A) represents boiled T. ni cell lysates, Column (B) represents
normal T. ni cell lysates, Column (C) represents activity in 0.5 mU of enzyme
standard, Column (D) represents lysate from T. ni cells infected with a baculovirus
coding for GalT, and Column (E) represents lysates from Sf-9 cells stably transfected
with the GalT gene.

Figure 12 depicts the reaction products resulting from incubation of UDP-Gal-6-Naph
and Dans-AE-GlcNAc in the presence of Galactose-transferase, as described
in the "Experimental" section below.

Figure 13 depicts the distinguishing emission spectra of GalT assay reactants
and products, as described in the "Experimental" section below. Irradiation of the
naphthyl group in UDP-Gal-6-Naph at 260-290 nm ("ex") results in an emission peak
at 320-370 nm ("em" dotted line) while irradiation of the Galactose-transferase
reaction products at these same low wavelengths results in energy transfer to the
dansyl group and an emission peak at 500-560 nm ("em" solid line).

Figure 14 depicts the oxidation reaction of sialic acid.

Figure 15 schematically depicts a new GlcNAc TI assay utilizing a synthetic
6-aminohexyl glycoside of the trimannosyl N-glycan core structure labeled with
DTPA (diethylenetriaminepentaacetic acid) and complexed with Eu+3 (see
"Experimental" section below). This substrate is incubated with insect cell lysates or
positive controls containing GlcNAc TI and UDP-GlcNAc. Chemical inhibitors are
added to minimize background N-acetylglucosaminidase activity. After the reaction,
an excess of Crocus lectin CVL (Misaki et al. (1997) J. Biol. Chem. 272:25455-25461),
which specifically binds the trimannosyl core, is added. The amount of lectin
required to bind all the trimannosyl glycoside (and hence all the Eu+3 label) in the
absence of any GlcNAc binding is predetermined. Following an ultrafiltration step,
the glycoside modified with GlcNAc (not binding CVL) appears in the filtrate.
Measurement of the Eu+3 fluorescence in the filtrate reflects the level of GlcNAc TI
activity in the culture lysates.

Figure 16 depicts a chromatogram of sialic acid levels in Sf-9 insect cells and
CHO (Chinese hamster ovary) cells. In the panel labeled "Sf-9 Free Sialic Acid
Levels" the known sialic acid standard elutes just prior to 10 minutes, while no
corresponding sialic acid peak can be detected (above background levels) in Sf-9
cells. In the panel labeled "CHO sialic acid levels" the sialic acid standard elutes at
approximately 9 minutes, while bound and free (released by acid hydrolysis) sialic
acid peaks are observed at similar elution positions.

Figure 17 depicts how selective inhibition of N-acetylglucosaminidase allows 
for production of complex oligosaccharide structures. 

Figure 18 depicts ethidium bromide-stained agarose gels following
electrophoresis of PCR amplification products from Sf-9 genomic DNA or High
Five™ (Invitrogen Corp., Carlsbad, CA, USA) cell cDNA templates using degenerate
primers corresponding to three different regions conserved within
N-acetylglucosaminidases.

Figure 19 depicts two potential specific chemical inhibitors of
N-acetylglucosaminidase.

Figure 20 schematically depicts that the overexpression of various
glycosyltransferases leads to greater production of oligosaccharide acceptor
substrates.

Figure 21 depicts three possible N-glycan acceptor structures which include 
the terminal Gal (G) acceptor residue required for subsequent sialylation. 

Figure 22 depicts a structure of CMP-sialic acid (CMP-SA).

Figure 23 depicts a metabolic pathway for ManNAc (N-acetylmannosamine)
from glucosamine and N-acetylglucosamine (GlcNAc).

Figure 24 depicts a ManNAc (N-acetylmannosamine) to sialic acid metabolic
pathway.

Figure 25 depicts the formation of CMP-sialic acid (CMP-SA) catalyzed by
CMP-SA synthetase.

Figure 26 depicts detection of purified (P) transferrin (hTf) or transferrin from
unpurified insect cell lysates (M) following separation on an SDS-PAGE gel, as
described in the Examples.

Figure 27 depicts the nucleotide sequence of human aldolase. 

Figure 28 depicts the amino acid sequence of human aldolase encoded by the
sequence shown in Figure 27.

Figure 29 depicts the nucleotide sequence of human CMP-SA synthetase
(cytidine monophosphate-sialic acid synthetase).

Figure 30 depicts the amino acid sequence of human CMP-SA synthetase
encoded by the sequence shown in Figure 29.

Figure 31 depicts the nucleotide sequence of human sialic acid synthetase
(human SA-synthetase; human SAS).

Figure 32 depicts the amino acid sequence of human SA-synthetase (SAS) 
encoded by the sequence shown in Figure 31. 

Figure 33 depicts the types and quantities of oligosaccharide structures found
on recombinant human transferrin in the presence and absence of Gal T
overexpression.

Figure 34 depicts bacterial and mammalian sialic acid metabolic pathways.

Figure 35 depicts human sialic acid synthetase (SAS) genetic information:
(A) depicts an alignment of the polypeptide encoded by the human SAS
polynucleotide open-reading frame; (B) shows the amino acid sequence homology
between human SAS (top) and bacterial sialic acid synthetase (NeuB) (bottom).

Figure 36 (A) depicts an autoradiogram of human sialic acid synthetase gene
products following gel electrophoresis. The lanes labeled "In Vitro" represent in vitro
transcription and translation products of SAS cDNA (amplified via polymerase chain
reaction (PCR)). Lane 1 ("pA2") depicts a negative control reaction in which pA2
plasmid (without the SAS cDNA) was PCR amplified, transcribed, translated, and
radiolabeled. Lane 2 ("pA2-SAS") depicts a sample reaction in which pA2-SAS
plasmid (containing the human SAS cDNA) was PCR amplified, transcribed,
translated, and radiolabeled. Lane 3 ("Marker") depicts radiolabeled protein standards
migrating at approximately 66, 46, 30, 21.5, and 14.3 kD. The lanes labeled "Pulse
Label" show radioactive 35S pulse labeling of polypeptides from insect cells infected
by virions not containing or containing the human SAS cDNA. Lane 4 ("A35")
depicts a negative control reaction of radiolabeled polypeptides from insect cells
infected with virions not containing the SAS cDNA. Lane 5 ("AcSAS") depicts a
sample reaction of radiolabeled polypeptides from insect cells infected with
baculovirus containing the human SAS cDNA. Figure 36 (B) depicts an RNA
(Northern) blot of human tissues (spleen, thymus, prostate, testis, ovary, small
intestine, peripheral blood lymphocytes (PBL), colon, heart, brain, placenta, lung,
liver, skeletal muscle, kidney, and pancreas) probed for sialic acid synthetase RNA
transcripts. Transcript sizes (in kilobases) are indicated by comparison to the scale on
the left side.

Figure 37 depicts chromatograms indicating the in vivo sialic acid content of
various cells as monitored following DMB derivatization and reverse phase HPLC
separation. Figure 37 (A) depicts the sialic acid content of lysed cell lines after
filtration through a 10,000 MWCO membrane. The cell lines analyzed were Sf-9
(insect) cells in standard media, Sf-9 cells supplemented with 10% FBS (fetal bovine
serum), or CHO (Chinese hamster ovary) cells. The original chromatogram values
have been divided by protein concentration to normalize chromatograms. The
standards shown are Neu5Ac at 1000 fmol, Neu5Gc at 200 fmol, and KDN at 50
fmol. Figure 37 (B) depicts a chromatogram of the sialic acid content of lysates from
various Sf-9 cells. "AcSAS Infected" cell lysates were from Sf-9 cells infected with
baculovirus containing the human SAS cDNA. The Neu5Ac and KDN "Standards"
are shown at 1,000 fmol concentrations. "A35 Infected" cell lysates are from Sf-9
cells infected by baculovirus not containing the SAS cDNA. "Uninfected" cell lysates
are from normal Sf-9 cells not infected by any baculovirus. Original chromatogram
values have been divided by protein concentration to normalize chromatograms.
Figure 37 (C) depicts a chromatogram of the sialic acid content from lysates of Sf-9
cells grown in media supplemented with 10 mM ManNAc; cells were infected or not
infected with baculovirus as shown in Figure 37 (B). Original chromatogram values
have been divided by protein concentrations to normalize chromatograms. Neu5Ac and
KDN standards represent 1,000 fmol. Figure 37 (D) depicts HPAEC (high performance
anion-exchange chromatography) analysis of lysates from Sf-9 cells infected with AcSAS or
A35 baculovirus with and without aldolase treatment. Samples were diluted prior to
column loading to normalize sialic acid quantities based on original sample protein
concentration. Neu5Ac standard is shown at 250 pmol and KDN standard is shown at
100 pmol.

Figure 38 depicts chromatograms of in vitro assays for sialic acid
phosphorylation activity. Assays were performed with and without alkaline
phosphatase (AP) treatment. Figure 38 (A) depicts chromatogram results of a
Neu5Ac-9-phosphate assay performed using lysates from Sf-9 cells infected with the
AcSAS baculovirus (containing the human SAS cDNA). KDN and Neu5Ac
standards are shown at 5000 fmol. Figure 38 (B) depicts chromatogram results of a
KDN-9-phosphate assay performed using lysates from Sf-9 cells infected with the
AcSAS baculovirus (containing the human SAS cDNA). KDN and Neu5Ac
standards are shown at 5000 fmol.

Figure 39 depicts a chromatogram demonstrating production of sialylated
nucleotides in Sf-9 insect cells following infection with baculoviruses containing
CMP-SA synthetase and SA synthetase. Sf-9 cells were grown in six well plates and
infected with baculovirus containing CMP-SA synthase and supplemented with 10
mM ManNAc ("CMP" line), with baculovirus containing CMP-SA synthase and SA
synthase plus 10 mM ManNAc supplementation ("CMP+SA" line), or with no
baculovirus and no ManNAc supplementation ("SF9" line).

DETAILED DESCRIPTION OF THE INVENTION

Compositions and methods for producing glycoproteins with sialylated
oligosaccharides are provided. In particular, the carbohydrate processing pathways of
cell lines of interest are manipulated to produce complex sialylated glycoproteins.
Such sialylated glycoproteins find use as pharmaceutical compositions, vaccines,
diagnostics, therapeutics, and the like.

Cells of interest include, but are not limited to, mammalian cells and
non-mammalian cells, such as, for example, CHO, plant, yeast, bacterial, and insect
cells, and the like. The methods of the invention can be practiced with any cells of
interest. By way of example, methods for the manipulation of insect cells are described
fully herein. However, it is recognized that the methods may be applied to other cells of
interest to construct processing pathways in any cell of interest for generating
sialylated glycoproteins.

Oligosaccharides on proteins are commonly attached to asparagine residues
found within Asn-X-Ser/Thr consensus sequences; such asparagine-linked
oligosaccharides are commonly referred to as "N-linked". The sialylation of N-linked
glycans occurs in the Golgi apparatus by the following enzymatic mechanism:
CMP-SA + GalGlcNAcMan-R --(sialyltransferase)--> SAGalGlcNAcMan-R + CMP. The
successful execution of this sialylation reaction depends on the presence of three
elements: 1) the correct carbohydrate acceptor substrate (designated GalGlcNAcMan-R
in the above reaction; where the acceptor substrate is a branched glycan,
GalGlcNAcMan is comprised by at least one branch of the glycan, the Gal is a
terminal Gal, and R is an N-linked glycan); 2) the proper donor nucleotide sugar,
cytidine monophosphate-sialic acid (CMP-SA); and 3) a sialyltransferase enzyme.
Each of these reaction components is limiting or missing in insect cells (Hooker et al.
(1997) Monitoring the glycosylation pathway of recombinant human
interferon-gamma produced by animal cells, Hsu et al. (1997) J. Biol. Chem. 272:9062-9070,
Jarvis et al. (1995) Virology 212:500-511, Jenkins et al. (1998) Cell Culture
Engineering VI, Oganah et al. (1996) Bio/Technology 14:197-202).
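
As a purely illustrative aid, the three elements required for the sialylation reaction described above can be expressed as a simple checklist in code. The following is a minimal sketch under stated assumptions: the names CellState, has_acceptor_branch, and missing_components are hypothetical, glycan branches are modeled as plain lists of residue names with the terminal residue first, and donor and enzyme availability are reduced to yes/no flags, which ignores the question of levels discussed in the text.

```python
from dataclasses import dataclass

# Acceptor motif GalGlcNAcMan, written terminal residue first (hypothetical encoding).
ACCEPTOR_MOTIF = ("Gal", "GlcNAc", "Man")


def has_acceptor_branch(branches: list[list[str]]) -> bool:
    """True if at least one branch begins with a terminal Gal-GlcNAc-Man."""
    return any(tuple(branch[:3]) == ACCEPTOR_MOTIF for branch in branches)


@dataclass
class CellState:
    branches: list[list[str]]      # N-glycan branches, terminal residue first
    cmp_sa_available: bool         # donor nucleotide sugar (CMP-SA) present
    sialyltransferase: bool        # sialyltransferase enzyme expressed


def missing_components(cell: CellState) -> list[str]:
    """List which of the three required elements is absent in this simplified model."""
    missing = []
    if not has_acceptor_branch(cell.branches):
        missing.append("acceptor")
    if not cell.cmp_sa_available:
        missing.append("donor")
    if not cell.sialyltransferase:
        missing.append("enzyme")
    return missing


if __name__ == "__main__":
    # Hypothetical paucimannosidic-like glycan with no donor and no enzyme.
    unmodified = CellState([["Man"], ["GlcNAc", "Man"]], False, False)
    print(missing_components(unmodified))
```

In this simplified model, a cell supports the reaction only when the returned list is empty, which mirrors the statement above that all three elements must be present.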

It will be apparent to those skilled in the art that where a cell of interest is
manipulated according to the methods of the invention such that the cell produces a
desired level of the donor substrate CMP-SA, and expresses a desired level of
sialyltransferase, any oligosaccharide or monosaccharide, any compound containing
an oligosaccharide or monosaccharide, any compatible aglycon (for example
Gal-sphingosine), any asparagine (N)-linked glycan, any serine- or threonine-linked
(O-linked) glycan, and any lipid containing a monosaccharide or oligosaccharide
structure can be a proper acceptor substrate and can be sialylated within the cell of
interest.

Accordingly, the methods of the invention may be applied to generate
sialylated glycoproteins for which the acceptor substrate is not necessarily limited to
the structure GalGlcNAcMan-R, although this structure is particularly recognized as
an appropriate acceptor substrate structure for production of N-linked sialylated
glycoproteins. Thus, according to the methods of the present invention, the acceptor
substrate can be any glycan. Preferably, the acceptor substrate according to the
methods of the invention is a branched glycan. Even more preferably, the acceptor
substrate according to the methods of the invention is a branched glycan comprising a
terminal Gal in at least one branch of the glycan. Yet even more preferably, the
acceptor substrate according to the methods of the invention has the structure
GalGlcNAcMan in at least one branch of the glycan and the Gal is a terminal Gal.

It will also be apparent to those skilled in the art that engineering the
sialylation process into cells of interest according to the methods of the present
invention requires the successful manipulation and integration of multiple interacting
metabolic pathways involved in carbohydrate processing. These pathways include
participation of glycosyltransferases, glycosidases, the donor nucleotide sugar
(CMP-SA) synthetases, and sialic acid transferases. "Carbohydrate processing
enzymes" of the invention are enzymes involved in any of the glycosyltransfer,
glycosidase, CMP-SA synthesis, and sialic acid transfer pathways. Known
carbohydrate engineering efforts have generally focused on the expression of
transferases (Lee et al. (1989) J. Biol. Chem. 264:13848-13855, Wagner et al. (1996)
J. Virology 70:4103-4109, Jarvis et al. (1996) Nature Biotech. 14:1288-1292,
Hollister et al. (1998) Glycobiology 5:473-480, Smith et al. (1990) J. Biol. Chem.
265:6225-6234, Grabenhorst et al. (1995) Eur. J. Biochem. 232:718-725; U.S. Patent
No. 5,047,335; International patent application publication number WO 98/06835).
However, it is recognized in this invention that the mere insertion of one or more
transferases into cells of interest does not ensure sialylation, as there are generally
insufficient levels of the donor (CMP-SA) and the acceptor substrates, particularly
GalGlcNAcMan-R.

The methods of the present invention permit manipulation of glycoprotein
production in cells of interest by enhancing the production of donor nucleotide sugar
substrate (CMP-SA) and, optionally, by introducing and expressing sialyltransferase
and/or acceptor substrates. By "cells of interest" is intended any cells in which the
endogenous CMP-SA levels are not sufficient for the production of a desired level of
sialylated glycoprotein in that cell. The cell of interest can be any eukaryotic or
prokaryotic cell. Cells of interest include, for example, insect cells, fungal cells, yeast
cells, bacterial cells, plant cells, mammalian cells, and the like. Human cells and cell
lines are also included in the cells of interest and may be utilized according to the
methods of the present invention to, for example, manipulate sialylated glycoproteins
in human cells and/or cell lines, such as, for example, kidney, liver, and the like. By
"desired level" is intended that the quantity of a biochemical comprised by the cell of
interest is altered subsequent to subjecting the cell to the methods of the invention. In
this manner, the invention comprises manipulating levels of CMP-SA and/or
sialylated glycoprotein in the cell of interest. In a preferred embodiment of the
invention, manipulating levels of CMP-SA and sialylated glycoprotein comprises
increasing the levels to above endogenous levels. It is recognized that the increase
can be from a non-detectable level to any detectable level; or the increase can be from
a detected endogenous level to a higher level.

According to the present invention, production of the acceptor substrate is
achieved by optionally screening a variety of cell lines for desirable processing
enzymes, suppressing unfavorable cleavage reactions that generate truncated
carbohydrates, and/or by enhancing expression of desired glycosyltransferase
enzymes such as galactose transferase. Methods of enhancing expression of certain
carbohydrate processing enzymes, including but not limited to, glycosyltransferases,
are described in U.S. Patent No. 5,047,335 and International patent application
publication number WO 98/06835, the contents of which are herein incorporated by
reference.

According to the present invention, production of the donor substrate,
CMP-SA, may be achieved by adding key precursors such as N-acetylmannosamine
(ManNAc), N-acetylglucosamine (GlcNAc) and glucosamine to cell growth media,
by enhancing expression of limiting enzymes in the CMP-SA production pathway in
the cells, or any combination thereof.

For purposes of the present invention, by "enhancing expression" is intended
to mean that the translated product of a nucleic acid encoding a desired protein is
higher than the endogenous level of that protein in the host cell in which the nucleic
acid is expressed. In a preferred embodiment of the invention, the biological activity
of a desired carbohydrate processing enzyme is increased by enhancing expression of
the enzyme.

For the purposes of the invention, by "suppressing activity" is intended to
mean decreasing the biological activity of an enzyme. In this aspect, the invention
encompasses reducing the endogenous expression of the enzyme protein, for example,
by using antisense and/or ribozyme nucleic acid sequences corresponding to the
amino acid sequences of the enzyme; gene knock-out mutagenesis; and/or by
inhibiting the activity of the enzyme protein, for example, by using chemical
inhibitors.

By "endogenous" is intended to mean the type and/or quantity of a biological 
function or a biochemical composition that is present in a naturally occurring or 
recombinant cell prior to manipulation of that cell according to the methods of the
invention. 

By "heterologous" is intended to mean the type and/or quantity of a biological 
function or a biochemical composition that is not present in a naturally occurring or 
recombinant cell prior to manipulation of that cell by the methods of the invention.

For purposes of the present invention, by "a heterologous polypeptide or protein"
is meant a polypeptide or protein expressed (i.e. synthesized) in a cell species of
interest that is different from the cell species in which the polypeptide or protein is
normally expressed (i.e. expressed in nature).

Methods for determining endogenous and heterologous functions and
compositions relevant to the invention are provided herein; and otherwise encompass
those methods known in the art.

Generation of Acceptor Carbohydrate Substrate: GalGlcNAcMan-R: 

According to the methods of the present invention, production of the acceptor
substrate glycan, GalGlcNAcMan-R, is particularly desirable for the sialylation
reaction of N-linked glycoproteins; moreover, the terminal Gal is required. Thus, in
one embodiment of the invention the cells of interest are manipulated (using
techniques described herein or otherwise known in the art) to contain this substrate.
For example, insect cells, which principally produce truncated carbohydrates
terminating in Man or GlcNAc, may routinely be manipulated to produce a
significant fraction of complex oligosaccharides terminating in Gal. Three
non-limiting, non-exclusive approaches that may be routinely applied to produce a
significant fraction of complex oligosaccharides terminating in Gal include: (1)
developing screening assays to analyze a selection of insect cell lines for the presence
of particular carbohydrate processing enzymes; (2) elevating production of
Gal-terminated oligosaccharides by expressing specific enzymes relevant to
carbohydrate processing pathways; and (3) suppressing carbohydrate processing
pathways that produce truncated N-linked glycans which cannot serve as acceptors in
downstream glycosyltransferase reactions.

Thus, in one embodiment, to produce GalGlcNAcMan-R acceptor substrates
according to the methods of the invention, cell lines of interest are initially, and
optionally, screened to identify cell lines with the desired endogenous carbohydrate
production for subsequent metabolic manipulations. More particularly, the screening
process includes characterizing cell lines for glycosyl transferase activity using
techniques described herein or otherwise known in the art. Furthermore, it is
recognized that any screened cell line could generate some paucimannosidic
carbohydrates. Accordingly, the screening process also includes using techniques
described herein or otherwise known in the art to characterize cell lines for particular
glycosidase activity leading to production of paucimannosidic structures.

Thus, in another embodiment, for the production of the acceptor substrates, the
invention encompasses utilizing methods described herein or otherwise known in the
art to enhance the expression of one or more transferases. Such methods include, but
are not limited to, methods that enhance expression of Gal T, GlcNAc-TI and -TII, or
any combination thereof, for example, as described in International patent application
publication number WO 98/06835 and U.S. Patent No. 5,047,335.

Thus, in another embodiment, concentrations of acceptor substrates are
increased by using methods described herein or otherwise known in the art to
suppress the activity of one or more endogenous glycosidases. By way of example,
an endogenous glycosidase, the activity of which may be suppressed according to the
methods of the invention, includes, but is not limited to, the hexosaminidase,
N-acetylglucosaminidase (an enzyme that degrades the substrate required for
oligosaccharide elongation).

Thus, the invention encompasses enhancing metabolic pathways that produce 
the desired acceptor carbohydrates and/or suppressing those pathways that produce 
truncated acceptors. 

Characterizing cell lines using enzyme screening assays

The cell lines of interest produce different N-glycan structures. Thus, such
cells can routinely be screened using techniques described herein or otherwise known
in the art to determine the presence of carbohydrate processing enzymes of interest.
In insect cells, for example, different insect cell lines produce very different N-glycan
structures (Jarvis et al. (1995) Virology 212:500-511, Hsu et al. (1997) J. Biol. Chem.
272:9062-9070, Nishimura et al. (1996) Bioorg. Med. Chem. 4:91-96). However,
only a few cell lines have been characterized, in part due to the lack of efficient
screening assays. The present invention provides methods implementing fluorescence
energy transfer and Europium fluorescence assays to screen a selection of different
cells of interest, such as, for example, insect cell lines, for the presence of critical
carbohydrate processing enzymes.

Analytical bioassays described herein or otherwise known in the art are also
provided according to the methods of the present invention to detect the presence of
favorable carbohydrate processing enzymes, including, but not limited to, galactosyl
transferase (Gal T), GlcNAc transferase I (GlcNAc T I), and sialyltransferase; and to
detect undesirable enzymes including, but not limited to, N-acetylglucosaminidase.

Where the cells of interest are insect cells, it will be immediately apparent that
substantial diversity exists among established insect cell lines due to the range of
species and tissues from which these lines were derived. Many of these lines can
routinely be infected by the baculovirus, Autographa californica nuclear polyhedrosis
virus (AcMNPV), and used for the production of heterologous proteins. However,
only a few cell lines are routinely used for recombinant protein production using
techniques described herein or otherwise known in the art. These cell lines will be
immediately apparent to one skilled in the art. It is recognized that any cell line can
be screened for specific carbohydrate processing enzymes, and manipulated for the
purposes of the present invention. Examples of such cell lines include, but are not
limited to, insect cell lines, including but not limited to, Spodoptera frugiperda (e.g.
Sf-9 or Sf-21 cells), Trichoplusia ni (T. ni), and Estigmene acrea (Ea-4). Spodoptera
frugiperda lines (Sf-9 or Sf-21) are the most widely used cell lines and a significant
amount of information is known about the oligosaccharide processing in these cells.
Trichoplusia ni (e.g. High Five™ cells; Invitrogen Corp., Carlsbad, CA, USA) cells
have been shown to secrete high yields of heterologous proteins with attached hybrid
and complex N-glycans (Davis et al. (1993) In Vitro Cell Dev. Biol. 29:842-846).
Estigmene acrea (Ea-4) cells have been used to generate hybrid and complex N-linked
oligosaccharides terminating in GlcNAc and Gal residues (Ogonah et al. (1996)
Bio/Technology 14:197-202).

Drosophila Schneider S2 cell lines represent another insect cell line used for
the production of heterologous proteins. Though these cells cannot be infected by the
AcMNPV expression vector, they are used for production of heterologous proteins via
an alternative technology known in the art. These cell lines represent other insect cell
line candidates whose glycosylation processing characteristics may be modified to
include sialylation.

In insect cells, paucimannosidic structures are produced by a membrane-bound
N-acetylglucosaminidase, which removes terminal GlcNAc residues from the
alpha(1,3) arm of the trimannosyl core (Altmann et al. (1995) J. Biol. Chem.
270:17344-17349). This trimannosyl core structure lacks the proper termini required
for conversion of side chains to sialylated complex structures; therefore, suppression
of the N-acetylglucosaminidase activity can reduce or eliminate the formation of these
undesired oligosaccharide structures, as illustrated in Figure 17.

To reduce the N-acetylglucosaminidase activity in the target insect cell line(s),
the invention provides vectors encoding N-acetylglucosaminidase or other
glucosaminidase cDNAs in the antisense orientation, and/or vectors encoding
ribozymes, and/or vectors containing sequences capable of "knocking out" the
N-acetylglucosaminidase or other glucosaminidase genes via homologous
recombination. Expression plasmids described herein or otherwise known in the art
are constructed using techniques known in the art to produce stably-transformed
insect cells that constitutively express the antisense construct and/or ribozyme
construct to suppress translation of N-acetylglucosaminidase or other
glucosaminidases; alternatively, homologous recombination techniques known in the
art are used to "knock out" the N-acetylglucosaminidase or other glucosaminidase
genes. Particular sequences to be used in the antisense and/or ribozyme construction
are described herein, for example, in Example 4. Techniques described herein or
otherwise known in the art may be routinely applied to analyze N-linked
oligosaccharide structures and to determine if N-glycan processing is altered and if
the number of paucimannosidic structures in these cells is reduced.

Antisense technology can be used to control gene expression through
antisense DNA or RNA or through triple-helix formation. Antisense techniques are
discussed, for example, in Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988). Triple helix formation is discussed in, for instance, Lee et al.,
Nucleic Acids Research 6:3073 (1979); Cooney et al., Science 241:456 (1988); and
Dervan et al., Science 251:1360 (1991). The methods are based on binding of a
polynucleotide to a complementary DNA or RNA. For example, the 5' coding
portion of a polynucleotide that encodes the amino terminal portion of
N-acetylglucosaminidase and/or other glucosaminidases may be used to design
antisense RNA oligonucleotides of from about 10 to 40 base pairs in length. A DNA
oligonucleotide is designed to be complementary to a region of the gene involved in
transcription, thereby preventing transcription and the production of
N-acetylglucosaminidase and/or other glucosaminidases. The antisense RNA
oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the mRNA
molecule into N-acetylglucosaminidase and/or other glucosaminidase polypeptides.
The oligonucleotides described above can also be delivered to cells such that the
antisense RNA or DNA may be expressed in vivo to inhibit production of
N-acetylglucosaminidase and/or other glucosaminidases.
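
By way of illustration only, the design of an antisense RNA oligonucleotide against the 5' portion of a target message can be sketched as taking the reverse complement of a window of the mRNA. The following is a minimal sketch under stated assumptions: the function name, its parameters and the demonstration sequence are hypothetical, and the length check simply mirrors the "about 10 to 40" range recited above.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str, start: int = 0, length: int = 20) -> str:
    """Return an antisense RNA oligo (written 5'->3') against mrna[start:start+length]."""
    if not 10 <= length <= 40:
        raise ValueError("length is outside the about 10 to 40 range recited in the text")
    # Accept DNA-style input by converting T to U.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    demo_mrna = "AUGGCUAGCUUAGGC"  # hypothetical message, for illustration only
    print(antisense_rna(demo_mrna, start=0, length=10))
```

The sketch addresses base complementarity only; it does not model the delivery, stability or in vivo expression of the oligonucleotide.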

In one embodiment, the N-acetylglucosaminidase and/or other
glucosaminidase antisense nucleic acids of the invention are produced intracellularly
by transcription from an exogenous sequence. For example, a vector or a portion
thereof, is transcribed, producing an antisense nucleic acid (RNA) of the invention.
Such a vector would contain a sequence encoding a N-acetylglucosaminidase and/or
other glucosaminidase antisense nucleic acid. Such a vector can remain episomal or
become chromosomally integrated, as long as it can be transcribed to produce the
desired antisense RNA. Such vectors can be constructed by recombinant DNA
technology methods standard in the art. Vectors can be plasmid, viral, or others known
in the art, used for replication and expression in insect, yeast, mammalian, and plant
cells. Expression of the sequences encoding N-acetylglucosaminidase and/or other
glucosaminidases, or fragments thereof, can be by any promoter known in the art to
act in insect, yeast, mammalian, and plant cells. Such promoters can be inducible or
constitutive. Such promoters include, but are not limited to, the baculovirus
polyhedrin promoter (Luckow et al. (1993) Curr. Opin. Biotech. 4:564-572, Luckow
et al. (1995)), the SV40 early promoter region (Bernoist and Chambon, Nature
29:304-310 (1981)), the promoter contained in the 3' long terminal repeat of Rous
sarcoma virus (Yamamoto et al., Cell 22:787-797 (1980)), the herpes thymidine
kinase promoter (Wagner et al., Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445 (1981)),
the regulatory sequences of the metallothionein gene (Brinster et al., Nature
296:39-42 (1982)), etc.

The antisense nucleic acids of the invention comprise sequences
complementary to at least a portion of an RNA transcript of N-acetylglucosaminidase
and/or other glucosaminidase genes. However, absolute complementarity, although
preferred, is not required. A sequence "complementary to at least a portion of an
RNA," referred to herein, means a sequence having sufficient complementarity to be
able to hybridize with the RNA, forming a stable duplex; in the case of
double-stranded N-acetylglucosaminidase and/or other glucosaminidase antisense
nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex
formation may be assayed. The ability to hybridize will depend on both the degree of
complementarity and the length of the antisense nucleic acid. Generally, the larger the
hybridizing nucleic acid, the more base mismatches with N-acetylglucosaminidase
and/or other glucosaminidase RNAs it may contain and still form a stable duplex (or
triplex as the case may be). One skilled in the art can ascertain a tolerable degree of
mismatch by use of standard procedures to determine the melting point of the
hybridized complex.
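
By way of illustration only, a first-pass estimate of the melting point of a short, perfectly matched duplex can be computed before any measurement is made. The following is a minimal sketch that uses the Wallace rule of thumb (2 °C per A/T or A/U pair, 4 °C per G/C pair), a commonly used approximation for short oligonucleotides; it is an assumption introduced here, not a procedure recited in this specification, and it does not replace the empirical determination referred to above. The function name is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C for a short, perfectly matched duplex.

    Wallace rule: Tm = 2 * (number of A/T/U) + 4 * (number of G/C).
    This is an approximation intended only for short oligonucleotides.
    """
    sequence = oligo.upper()
    weak = sum(sequence.count(base) for base in "ATU")   # two hydrogen bonds
    strong = sum(sequence.count(base) for base in "GC")  # three hydrogen bonds
    if weak + strong != len(sequence):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 2 * weak + 4 * strong


if __name__ == "__main__":
    print(wallace_tm("AGCUAGCCAU"))  # hypothetical 10-mer, for illustration only
```

Each mismatch lowers the stability of the duplex relative to such an estimate, which is consistent with the observation above that longer hybridizing nucleic acids tolerate more mismatches.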

Oligonucleotides that are complementary to the 5' end of the message, e.g.,
the 5' untranslated sequence up to and including the AUG initiation codon, should
work most efficiently at inhibiting translation. However, sequences complementary
to the 3' untranslated sequences of mRNAs have been shown to be effective at
inhibiting translation of mRNAs as well. See generally, Wagner, R., 1994, Nature
372:333-335. Thus, oligonucleotides complementary to either the 5'- or 3'-
non-translated, non-coding regions of N-acetylglucosaminidase and/or other
glucosaminidases could be used in an antisense approach to inhibit translation of
endogenous N-acetylglucosaminidase and/or other glucosaminidase mRNAs.
Oligonucleotides complementary to the 5' untranslated region of the mRNA should
include the complement of the AUG start codon. Antisense oligonucleotides
complementary to mRNA coding regions are less efficient inhibitors of translation
but could be used in accordance with the invention. Whether designed to hybridize to
the 5'-, 3'- or coding region of N-acetylglucosaminidase and/or other glucosaminidase
mRNAs, antisense nucleic acids should be at least six nucleotides in length, and are
preferably oligonucleotides ranging from 6 to about 50 nucleotides in length. In
specific aspects the oligonucleotide is at least 10 nucleotides, at least 17 nucleotides,
at least 25 nucleotides or at least 50 nucleotides.

The polynucleotides of the invention can be DNA or RNA or chimeric
mixtures or derivatives or modified versions thereof, single-stranded or
double-stranded. The oligonucleotide can be modified at the base moiety, sugar
moiety, or phosphate backbone, for example, to improve stability of the molecule,
hybridization, etc. The oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), agents facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556; Lemaitre et al., Proc. Natl. Acad. Sci. 84:648-652 (1987); PCT
Publication No. WO88/09810, published December 15, 1988), hybridization-triggered
cleavage agents (see, e.g., Krol et al., BioTechniques 6:958-976 (1988)), or
intercalating agents (see, e.g., Zon, Pharm. Res. 5:539-549 (1988)). To this end, the
oligonucleotide may be conjugated to another molecule, e.g., a peptide,
hybridization-triggered cross-linking agent, transport agent, hybridization-triggered
cleavage agent, etc.

The antisense oligonucleotide may comprise at least one modified base moiety
which is selected from the group including, but not limited to, 5-fluorouracil,
5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic
acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w,
and 2,6-diaminopurine.

The antisense oligonucleotide may also comprise at least one modified sugar
moiety selected from the group including, but not limited to, arabinose,
2-fluoroarabinose, xylulose, and hexose.

In yet another embodiment, the antisense oligonucleotide comprises at least
one modified phosphate backbone selected from the group including, but not limited
to, a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a
phosphoramidate, a phosphorodiamidate, a methylphosphonate, an alkyl
phosphotriester, and a formacetal or analog thereof.

In yet another embodiment, the antisense oligonucleotide is an alpha-anomeric
oligonucleotide. An alpha-anomeric oligonucleotide forms specific double-stranded
hybrids with complementary RNA in which, contrary to the usual beta-units, the
strands run parallel to each other (Gautier et al., Nucl. Acids Res. 15:6625-6641
(1987)). The oligonucleotide is a 2'-O-methylribonucleotide (Inoue et al., Nucl. Acids
Res. 15:6131-6148 (1987)), or a chimeric RNA-DNA analogue (Inoue et al., FEBS
Lett. 215:327-330 (1987)).

Polynucleotides of the invention may be synthesized by standard methods
known in the art, e.g. by use of an automated DNA synthesizer (such as are
commercially available from Biosearch, Applied Biosystems, etc.). As examples,
phosphorothioate oligonucleotides may be synthesized by the method of Stein et al.
(Nucl. Acids Res. 16:3209 (1988)), methylphosphonate oligonucleotides can be
prepared by use of controlled pore glass polymer supports (Sarin et al., Proc. Natl.
Acad. Sci. U.S.A. 85:7448-7451 (1988)), etc.

While antisense nucleotides complementary to the N-acetylglucosaminidase
and/or other glucosaminidase coding region sequences could be used, those
complementary to the transcribed untranslated region are most preferred.

Potential N-acetylglucosaminidase or other glucosaminidase activity
suppressors according to the invention also include catalytic RNA, or a ribozyme
(see, e.g., PCT International Publication WO 90/11364, published October 4, 1990;
Sarver et al., Science 247:1222-1225 (1990)). While ribozymes that cleave mRNA at
site-specific recognition sequences can be used to destroy N-acetylglucosaminidase
and/or other glucosaminidase mRNAs, the use of hammerhead ribozymes is
preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking
regions that form complementary base pairs with the target mRNA. The sole
requirement is that the target mRNA have the following sequence of two bases:
5'-UG-3'. The construction and production of hammerhead ribozymes is well known
in the art and is described more fully in Haseloff and Gerlach, Nature 334:585-591
(1988). Preferably, the ribozyme is engineered so that the cleavage recognition site is
located near the 5' end of the N-acetylglucosaminidase and/or other glucosaminidase
mRNAs; i.e., to increase efficiency and minimize the intracellular accumulation of
non-functional mRNA transcripts.
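
By way of illustration only, candidate cleavage recognition sites can be enumerated by scanning the target message for the 5'-UG-3' dinucleotide recited above and listing the sites closest to the 5' end first. The following is a minimal sketch; the function name and the demonstration sequence are hypothetical, and the scan considers only the two-base requirement stated in the text, not the design of the flanking complementary regions.

```python
def ug_sites(mrna: str) -> list[int]:
    """Return 1-based positions of the U in every 5'-UG-3' dinucleotide.

    Positions are returned in ascending order, so the sites nearest the
    5' end of the message (the preferred location) come first.
    """
    # Accept DNA-style input by converting T to U.
    sequence = mrna.upper().replace("T", "U")
    return [i + 1 for i in range(len(sequence) - 1) if sequence[i:i + 2] == "UG"]


if __name__ == "__main__":
    demo_mrna = "AUGGCUGAUGC"  # hypothetical message, for illustration only
    print(ug_sites(demo_mrna))
```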

As in the antisense approach, the ribozymes of the invention can be composed
of modified oligonucleotides (e.g. for improved stability, targeting, etc.) and should
be delivered to cells which express N-acetylglucosaminidase and/or other
glucosaminidases in vivo. DNA constructs encoding the ribozyme may be introduced
into the cell in the same manner as described above for the introduction of
antisense-encoding DNA. A preferred method of delivery involves using a DNA
construct "encoding" the ribozyme under the control of a strong constitutive promoter,
such as, for example, a pol III or pol II promoter, so that transfected cells will produce
sufficient quantities of the ribozyme to destroy endogenous N-acetylglucosaminidase
and/or other glucosaminidase messages and inhibit translation. Since ribozymes,
unlike antisense molecules, are catalytic, a lower intracellular concentration is
required for efficiency.

Endogenous gene expression can also be reduced by inactivating or "knocking
out" the N-acetylglucosaminidase and/or other glucosaminidase gene and/or its
promoter using targeted homologous recombination (e.g., see Smithies et al., Nature
317:230-234 (1985); Thomas & Capecchi, Cell 51:503-512 (1987); Thompson et al.,
Cell 5:313-321 (1989); each of which is incorporated by reference herein in its
entirety). For example, a mutant, non-functional polynucleotide of the invention, or a
completely unrelated DNA sequence (such as, for example, a sialic acid synthetase)
flanked by DNA homologous to the endogenous polynucleotide sequence (either the
coding regions or regulatory regions of the gene) can be used, with or without a
selectable marker and/or a negative selectable marker, to transfect cells that express
polypeptides of the invention in vivo. In another embodiment, techniques known in
the art are used to generate knockouts in cells that contain, but do not express, the
gene of interest. Insertion of the DNA construct, via targeted homologous
recombination, results in inactivation of the targeted gene. Such approaches are
particularly suited in research and agricultural fields where modifications to
embryonic stem cells can be used to generate animal offspring with an inactive
targeted gene (e.g., see Thomas & Capecchi 1987 and Thompson 1989, supra). The
contents of each of the documents recited in this paragraph are herein incorporated by
reference in their entirety.

The use of chemical inhibitors is also within the scope of the present
invention, in addition to, or as an alternative to, the antisense approach, and/or the
ribozyme approach, and/or the gene "knock-out" approach, as means for suppressing
glucosaminidase activity in insect cell cultures. Chemical inhibitors that may be used
to suppress glucosaminidase activity include, but are not limited to,
2-acetamido-1,2,5-trideoxy-1,5 amino-D-glucitol, which can limit the
N-acetylglucosaminidase activity in insect cells (Legler et al. (1991) Biochim.
Biophys. Acta 1080:80-95, Wagner et al. (1996) J. Virology 70:4103-4109). In
addition, a number of other N-acetylglucosaminidase inhibitors may also be used
according to the present invention, including, but not limited to, nagstatin (with a Ki
value in the 10^-8 range) and GlcNAc-oxime (Ki in 0.45-22 mM), which are
commercially, publicly, or otherwise available for the purposes of the present
invention (Nishimura et al. (1996) Bioorg. Med. Chem. 4:91-96, Aoyagi et al. (1992)
J. Antibiotics 45:1404-1408).
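
By way of illustration only, the practical meaning of an inhibition constant (Ki) can be sketched by estimating the fraction of enzyme activity that remains at a given inhibitor concentration. The following is a minimal sketch that assumes simple competitive inhibition under Michaelis-Menten kinetics, an assumption introduced here and not stated in this specification for any particular inhibitor; the function name and its parameters are hypothetical, and no values from the cited references are used.

```python
def residual_activity(substrate: float, km: float, inhibitor: float, ki: float) -> float:
    """Fraction of uninhibited reaction rate remaining with a competitive inhibitor.

    Assumes Michaelis-Menten kinetics:
        v_inhibited / v_uninhibited = (Km + S) / (Km * (1 + I / Ki) + S)
    All concentrations must be expressed in the same units.
    """
    if substrate <= 0 or km <= 0 or ki <= 0 or inhibitor < 0:
        raise ValueError("substrate, km and ki must be positive; inhibitor must be >= 0")
    return (km + substrate) / (km * (1.0 + inhibitor / ki) + substrate)


if __name__ == "__main__":
    # Arbitrary illustrative values in arbitrary units, not measured data.
    for inhibitor_conc in (0.0, 1.0, 10.0):
        fraction = residual_activity(substrate=1.0, km=1.0, inhibitor=inhibitor_conc, ki=1.0)
        print(f"inhibitor = {inhibitor_conc}: residual activity = {fraction:.3f}")
```

Under this assumption, an inhibitor with a lower Ki achieves the same suppression at a proportionally lower concentration.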

The chemical inhibitors mentioned above do not distinguish between
lysosomal N-acetylglucosaminidase and the target membrane-bound
N-acetylglucosaminidase activity in the secretory compartment. Thus, a more specific
inhibitor, based on the substrate structure, is provided to serve not merely as a
competitive inhibitor, but also as an affinity labeling reagent. The chemical structures
of two possible chemical compounds with specificity for inhibiting membrane-bound
glucosaminidase, one or both of which may be used according to the present
invention, are shown in Figure 19. Subsequent to expression and purification of the
N-acetylglucosaminidase, the effectiveness of these inhibitors may be tested and
compared in in vitro and/or in vivo trials using techniques described herein or
otherwise known in the art. As above, these chemical inhibitors are then used in
addition to, or as an alternative to, antisense suppression, ribozyme suppression,
and/or gene knock-out mutagenesis, of glucosaminidase activity in insect cells.

It is recognized that the suppression of glucosaminidase activity alone may not
lead to production of the desired acceptor carbohydrate if the enzymes responsible
for generating structures terminating in Gal are lacking in particular cell lines. Thus,
according to the methods of the present invention, Gal T activity in insect cells can be
increased significantly by using techniques described herein or otherwise
known in the art to express a heterologous gene using a baculovirus construct
containing nucleic acid sequences encoding Gal T or a fragment or variant thereof, or
by stably transforming the cells with a gene coding for Gal T or a fragment or variant
thereof. If N-glycan analysis indicates that lower than a desired level of the acceptor
substrates is present even following glucosaminidase suppression, techniques
described herein or otherwise known in the art may be applied to express
glycosyltransferase enzymes as needed in insect cells to produce a larger fraction of
the desired acceptor structures. Figure 20 depicts how the overexpression of various
glycosyltransferases leads to greater production of acceptor substrates.

Alternatively, the expression of glycosyltransferases will serve to limit
generation of paucimannosidic structures by generating structures terminating in Gal,
which are not acceptable glucosaminidase substrates, or by competing against the
glucosaminidase reaction (Wagner et al. (1996) Glycobiology 6:165-175).

Thus, the invention comprises expression of glycosyltransferases combined
with, or as an alternative to, suppression of N-acetylglucosaminidase activity in
selected insect cell lines to produce desired quantities of carbohydrates containing the
correct Gal (G) acceptor substrate for sialylation. Figure 21 illustrates, without
limitation, three examples of acceptor N-glycan structures that comprise the terminal
Gal acceptor residue required for subsequent sialylation. Other desired carbohydrate
structures with a branch terminating in Gal are also possible and are encompassed by the
invention.

Baculovirus expression vectors containing the coding sequence for GlcNAc-TI
and -TII, and Gal T, or fragments or variants thereof, and stable transfectants
overexpressing GlcNAc-TI and GlcNAc-TII, and Gal T, or fragments or variants
thereof, are known, can be routinely generated using techniques known in the art, and
are commercially, publicly, or otherwise available for the purposes of this invention
(see Jarvis et al. (1996) Nature Biotech 14:1288-1292; Hollister et al. (1998)
Glycobiology 8:473-480; the contents of which are herein incorporated by reference).
In addition, stable transfectants expressing GlcNAc-TI and GlcNAc-TII can be
routinely generated using techniques known in the art, if overexpression proves
desirable.

Production and delivery of the Donor Substrate: CMP-Sialic Acid (CMP-SA)

For production of the donor substrate, CMP-SA, the invention provides
methods and compositions comprising expression of limiting enzymes in the CMP-SA
production pathway, in addition to, or as an alternative to, the feeding of precursor
substrates.

To produce sialylated N-linked glycoproteins, the donor substrate, CMP-sialic
acid (CMP-SA), must be synthesized. The structure of CMP-SA is shown in Figure
22. CMP-SA can be enzymatically synthesized from glucose or other simple sugars,
glutamine, and nucleotides in mammalian cells and E. coli using the metabolic
pathways shown in Figure 5, and as described in Ferwerda et al. (1983) Biochem. J.
216:87-92; Mahmoudian et al. (1997) Enzyme and Microbial Technology 20:393-400;
Schachter et al. (1973) Metabolic Conjugation and Metabolic Hydrolysis (New York
Academic Press) 2-135.

In some mammalian tissues and cell lines, the production and delivery of
CMP-SA limits the sialylation capacity of these cells (Gu et al. (1997) Improvement
of the interferon-gamma sialylation in Chinese hamster ovary cell culture by feeding
N-acetylmannosamine). This problem is likely to be amplified in insect cells since
negligible sialic acid levels are detected in Trichoplusia ni insect cells as compared to
levels in Chinese Hamster Ovary (CHO) mammalian cells (Figure 16). Furthermore,
negligible CMP-SA was observed in Sf-9 and Ea-4 insect cells when compared to
CHO cells (Hooker et al. (1997) Monitoring the Glycosylation Pathway of
Recombinant Human Interferon-Gamma Produced by Animal Cells, European
Workshop on Animal Cell Engineering, Costa Brava, Spain; and Jenkins (1998)
Restructuring the Carbohydrates of Recombinant Glycoproteins, Cell Culture
Engineering VI, San Diego, CA). These findings are relevant in light of the
previously published observation that polysialic acid can be detected in Drosophila
embryos (Roth et al. (1992) Science 256:673-675) and the observation of sialylated
glycoproteins produced by other insect cells (Davidson et al. (1990) Biochemistry
29:5584-5590).

Production of sialic acid (SA), more specifically N-acetylneuraminic acid
(NeuAc), from the precursor substrate ManNAc can proceed through three alternative
pathways shown in Figure 5. The principal pathway for the production of SA in E.
coli and other bacteria utilizes phosphoenolpyruvate (PEP) and ManNAc to
produce sialic acid in the presence of sialic acid synthetase (Vann et al. (1997)
Glycobiology 7:697-701). A second pathway, observed in bacteria and mammals,
involves the reversible conversion by aldolase (also named N-acetylneuraminate
lyase) of ManNAc and pyruvate to sialic acid (Schachter et al. (1973) Metabolic
Conjugation and Metabolic Hydrolysis (New York Academic Press) 2-135, Lilley et
al. (1992) Prot. Expr. and Pur. 3:434-440). The aldolase reaction equilibrates
toward ManNAc but can be manipulated to favor the production of sialic acid by the
addition of excess ManNAc or pyruvate in vitro (Mahmoudian et al. (1997) Enzyme
and Microbial Technology 20:393-400). The third pathway, observed only in
mammalian tissue, begins with the ATP-driven phosphorylation of ManNAc, and is
followed by the enzymatic conversion of phosphorylated ManNAc to a
phosphorylated form of sialic acid, from which the phosphate is removed in a
subsequent step (van Rinsum et al. (1983) Biochem. J. 210:21-28, Schachter et al.
(1973) Metabolic Conjugation and Metabolic Hydrolysis (New York Academic
Press) 2-135).

According to one embodiment of the invention, to overcome intracellular
limitations of CMP-SA in mammalian cells, feeding of alternative precursor
substrates may be applied to eliminate or reduce the need to produce CMP-SA from
simple sugars (see Example 6). Since CMP-SA and its direct precursor, SA, do not
permeate cell membranes (Bennetts et al. (1981) J. Cell. Biol. 88:1-15), these
substrates cannot be added to the culture medium for uptake by the cell. However,
other precursors, including N-acetylmannosamine (ManNAc), glucosamine, and
N-acetylglucosamine (GlcNAc), when added to the culture medium are absorbed into
mammalian cells (see Example 6). See, for example, Gu et al. (1997) Improvement of
the interferon-gamma sialylation in Chinese hamster ovary cell culture by feeding
N-acetylmannosamine, Zanghi et al. (1997) European Workshop on Animal Cell
Engineering, Ferwerda et al. (1983) Biochem. J. 216:87-92, Kohn et al. (1962) J.
Biol. Chem. 237:304-308, Thomas et al. (1985) Biochim. Biophys. Acta 846:37-43,
Bennetts et al. (1981) J. Cell. Biol. 88:1-15. The substrates are then enzymatically
converted to CMP-SA and incorporated into homologous and heterologous
glycoproteins (Gu et al. (1997) Improvement of the interferon-gamma sialylation in
Chinese hamster ovary cell culture by feeding N-acetylmannosamine, Ferwerda et al.
(1983) Biochem. J. 216:87-92, Kohn et al. (1962) J. Biol. Chem. 237:304-308,
Bennetts et al. (1981) J. Cell. Biol. 88:1-15).

To be incorporated into oligosaccharides, sialic acid and cytidine triphosphate
(CTP) must be converted to CMP-SA by the enzyme CMP-sialic acid (CMP-SA)
synthetase (Schachter et al. (1973) Metabolic Conjugation and Metabolic Hydrolysis
(New York Academic Press) 2-135):

Sialic Acid + CTP → CMP-SA + PPi

This enzyme has been cloned and sequenced from E. coli and used for the in
vitro production of CMP-SA, as described in Zapata et al. (1989) J. Biol. Chem.
264:14769-14774, Kittelmann et al. (1995) Appl. Microbiol. Biotechnol. 44:59-67,
Ichikawa et al. (1992) Anal. Biochem. 202:215-238, Shames et al. (1991)
Glycobiology 1:187-191; the contents of which are herein incorporated by reference.

In eukaryotes, the activated sugar nucleotide, CMP-SA, must be transported
into the Golgi lumen for sialylation to proceed (Deutscher et al. (1984) Cell
39:295-299). Transport through the trans-Golgi membrane is facilitated by the CMP-SA
transporter protein, which was identified by complementation cloning into
sialylation-deficient CHO cells (Eckhardt et al. (1996) Proc. Natl. Acad. Sci. USA
93:7572-7576). This mammalian gene has also been cloned and expressed in a functional
form in the heterologous host, S. cerevisiae (Berninsone et al. (1997) J. Biol. Chem.
272:12616-12619).

In addition to feeding of external precursor substrates such as ManNAc,
GlcNAc, or glucosamine to increase CMP-SA levels, a supplementary approach in
which CMP-SA transporter genes are introduced and expressed using routine
recombinant DNA techniques may also be employed according to the methods of the
present invention. These techniques are optionally combined with the ManNAc,
GlcNAc, or glucosamine feeding strategies described above, to maximize CMP-SA
production.

Conversion of GlcNAc or glucosamine to ManNAc 

Also according to the methods of the present invention, where the utilization
of GlcNAc or glucosamine is preferred and ManNAc is not generated naturally in
insect cells, ManNAc can be produced chemically using sodium hydroxide
(Mahmoudian et al. (1997) Enzyme and Microbial Technology 20:393-400).
Alternatively, the enzymes that convert these substrates to ManNAc, or fragments or
variants of these enzymes, can be expressed in insect cells using techniques described
herein or otherwise known in the art. The production of ManNAc from GlcNAc and
glucosamine proceeds through the metabolic pathway shown in Figure 23.

Two approaches are provided to accomplish this conversion: (a) direct
epimerization of GlcNAc; or (b) conversion of GlcNAc or glucosamine to
UDP-N-acetylglucosamine (UDP-GlcNAc), and then ManNAc. According to one embodiment
of the invention, approach (a) is achieved using the gene encoding a
GlcNAc-2-epimerase isolated from pig kidney, or fragments or variants thereof, to
directly convert GlcNAc to ManNAc (see Maru et al. (1996) J. Biol. Chem.
271:16294-16299; the contents of which are herein incorporated by reference).
Additionally, the sequence for a homologue of this enzyme can be routinely obtained from
bioinformatics databases, and cloned into baculovirus vectors, or stably integrated
into insect cells using techniques described herein or otherwise known in the art.

Alternatively, approach (b) requires insertion of the gene to convert
UDP-GlcNAc to ManNAc. Engineering the production of UDP-GlcNAc from glucosamine
or GlcNAc is likely not required since most insect cells comprise metabolic pathways
to synthesize UDP-GlcNAc, as indicated by the presence of GlcNAc-containing
oligosaccharides. According to one embodiment of the invention, the gene encoding
a rat bifunctional enzyme catalyzing conversion of UDP-GlcNAc to ManNAc and
ManNAc to ManNAc-6-P, or fragments or variants thereof, is used to engineer the
conversion of UDP-GlcNAc using techniques described herein or otherwise known in
the art (Stasche et al. (1997) J. Biol. Chem. 272:24319-24324, the contents of which are
herein incorporated by reference). In a specific embodiment, the segment of this
enzyme responsible for conversion of UDP-GlcNAc to ManNAc may be expressed
independently in insect cells using techniques known in the art to produce ManNAc
rather than ManNAc-6-P.

Conversion of ManNAc to SA 

Once ManNAc is generated, it is converted to SA according to the methods of
the invention. There are three possible metabolic pathways for the conversion of
ManNAc to SA in bacteria and mammals, as shown in Figure 24. Negligible SA
levels have previously been observed in insect cells (in the absence of exogenous
supplementation of ManNAc to the culture media).

The conversion of ManNAc and PEP to SA using sialic acid synthetase is the
predominant pathway for SA production in E. coli (Vann et al. (1997) Glycobiology
7:697-701). The E. coli sialic acid (SA) synthetase gene NeuB (SEQ ID NO:7 and 8)
has been cloned and sequenced and is commercially, publicly, and/or otherwise
available for the purposes of the present invention. Additionally, as disclosed herein,
the human sialic acid synthetase gene has also been cloned (cDNA clone HA5AA37),
sequenced, and deposited with the American Type Culture Collection ("ATCC") on
February 24, 2000 and was given the ATCC Deposit Number ________. (The
ATCC is located at 10801 University Boulevard, Manassas, VA 20110-2209, USA.
ATCC deposits were made pursuant to the terms of the Budapest Treaty on the
international recognition of the deposit of microorganisms for purposes of patent
procedure.) Thus, for enhancing expression of SA synthetase according to certain
embodiments of the invention, the nucleic acid compositions encoding an SA
synthetase such as, for example, an E. coli and/or human sialic acid synthetase and/or
a fragment or variant thereof, may be inserted into a host expression vector or into the
host genome using techniques described herein or otherwise known in the art.

According to the methods of the invention, the production of SA can also be achieved
from ManNAc and pyruvate using an aldolase, such as, for example, bacterial
aldolase (Mahmoudian et al. (1997) Enzyme and Microbial Technology 20:393-400),
or a human aldolase (as described herein) or fragment or variant thereof. The human
aldolase gene has been cloned (cDNA clone HDPAK85), sequenced, and deposited
with the American Type Culture Collection ("ATCC") on February 24, 2000 and was
given the ATCC Deposit Number ________. Thus, the aldolase enzyme is
considered as an alternative for converting ManNAc to SA. For enhancing expression
of aldolase, the aldolase sequences can be amplified directly from E. coli and human
DNA using primers and PCR amplification as described in Mahmoudian et al.
(1997) Enzyme and Microbial Technology 20:393-400 (the contents of which are
herein incorporated by reference) and herein, and using
techniques described herein or otherwise known in the art to enhance expression of
aldolase, or a fragment or variant thereof. Since the aldolase reaction is reversible,
high levels of added ManNAc and pyruvate may be used according to the methods of
the invention to drive this reversible reaction in the direction of the product SA
(Mahmoudian et al. (1997) Enzyme and Microbial Technology 20:393-400).
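
The effect of excess substrate on a reversible condensation of this kind can be sketched with a simple mass-action calculation. The following is a minimal illustration that assumes ideal behavior and a single equilibrium ManNAc + pyruvate <-> SA; the function name and the equilibrium constant used in the usage lines are hypothetical and are not values reported in the cited references.

```python
import math


def equilibrium_sialic_acid(k_eq, mannac0, pyruvate0):
    """Equilibrium SA concentration for ManNAc + pyruvate <-> SA.

    Solves K = x / ((m0 - x) * (p0 - x)) for x, i.e. the quadratic
        K*x^2 - (K*(m0 + p0) + 1)*x + K*m0*p0 = 0,
    and returns the smaller root, the only one not exceeding min(m0, p0).
    k_eq has units of 1/concentration; both concentrations share one unit.
    """
    if k_eq <= 0 or mannac0 < 0 or pyruvate0 < 0:
        raise ValueError("k_eq must be positive and concentrations non-negative")
    b = k_eq * (mannac0 + pyruvate0) + 1.0
    disc = b * b - 4.0 * k_eq * k_eq * mannac0 * pyruvate0  # always > 0
    return (b - math.sqrt(disc)) / (2.0 * k_eq)


if __name__ == "__main__":
    K_HYPOTHETICAL = 10.0  # per molar; placeholder value for illustration only
    mannac = 0.1           # molar, hypothetical starting concentration
    for excess in (1, 5, 20):
        sa = equilibrium_sialic_acid(K_HYPOTHETICAL, mannac, mannac * excess)
        print(f"pyruvate excess x{excess}: {100 * sa / mannac:.1f}% of ManNAc converted")
```

In this model, raising the pyruvate excess always raises the fraction of ManNAc converted, which is the qualitative behavior the paragraph above relies on.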

In addition to the pathways which convert ManNAc to SA present in both
prokaryotes and eukaryotes, an exclusively eukaryotic pathway may also be employed
according to the methods of the invention to convert ManNAc to SA through the
phosphate intermediates ManNAc-6-phosphate and SA-9-phosphate. It is recognized
that the mammalian enzymes (synthetase and phosphatase) responsible for converting
ManNAc to SA through phosphate intermediates can be utilized for engineering this
eukaryotic pathway into insect cells.
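
The three ManNAc-to-SA routes described in this section can be summarized as a small lookup structure, for example when deciding which route a given engineered cell line could support. This is only a sketch: the short enzyme labels and the function name are hypothetical identifiers chosen for illustration, and the co-substrates listed are limited to those named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    cosubstrates: tuple  # consumed together with ManNAc, as named in the text
    enzymes: tuple       # activities that must all be present in the cell


# Enzyme labels below are hypothetical identifiers for this sketch only.
ROUTES = (
    Route("SA synthetase", ("PEP",), ("sa_synthetase",)),
    Route("aldolase", ("pyruvate",), ("aldolase",)),
    Route(
        "phosphate intermediates",
        ("ATP",),
        ("mannac_phosphorylation", "synthetase", "phosphatase"),
    ),
)


def usable_routes(expressed):
    """Return the names of routes whose required activities are all expressed."""
    expressed = set(expressed)
    return [route.name for route in ROUTES if expressed.issuperset(route.enzymes)]


if __name__ == "__main__":
    # A hypothetical cell line engineered with SA synthetase and aldolase.
    print(usable_routes({"sa_synthetase", "aldolase"}))
```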

Conversion of SA to CMP-SA

The methods of the invention also encompass the use of CMP-SA synthetase
to enzymatically convert SA to CMP-SA (see, e.g., the reaction shown in Figure 25).
However, insect cells, such as, for example, Sf9 insect cells, have negligible
endogenous CMP-SA synthetase activity. Evidence of limited CMP-SA synthetase in
insect cells is also provided by the increased SA levels found following substrate
feeding and genetic manipulation without a concomitant increase in CMP-SA.

Thus, specific embodiments of the invention provide methods for enhancing
the expression of CMP-SA synthetase, and/or fragments or variants thereof. Bacterial
CMP-SA synthetase has been cloned and sequenced as described in Zapata et al.
(1989) J. Biol. Chem. 264:14769-14774; the contents of which are herein incorporated
by reference. Additionally, as described herein, the gene encoding human CMP-SA
synthetase has also been cloned (cDNA clone HWLLM34), sequenced and deposited
with the American Type Culture Collection ("ATCC") on February 24, 2000 and was
given the ATCC Deposit Number ________. Thus, in specific embodiments, the
methods of the present invention provide for enhancing expression of bacterial or
human CMP-SA synthetase, or fragments or variants thereof, in cells of interest, such
as, for example, in insect cells, using techniques described herein, or otherwise known
in the art.



Golgi transport of CMP-SA 

CMP-SA must be delivered into the Golgi apparatus in order for sialylation to
occur, and this transport process depends on the presence of the CMP-SA transporter
protein (Deutscher et al. (1984) Cell 39:295-299). To determine if CMP-SA
synthesized in insect cells is efficiently transported into the proper cellular
compartment, insect cell vesicles are prepared and transport of CMP-SA is measured
as described in Berninsone et al. (1997) J. Biol. Chem. 272:12616-12619 and/or using
techniques otherwise known in the art. Where the native enzymatic transport is lower
than desired, a transporter enzyme is cloned and expressed in insect cells using the
known mammalian gene sequence (as described in Berninsone et al. (1997) J. Biol.
Chem. 272:12616-12619, Eckhardt et al. (1996) Proc. Natl. Acad. Sci. USA
93:7572-7576; the contents of which are herein incorporated by reference) and/or
sequences otherwise known in the art. Corresponding sequences are available from
bioinformatics databases for the purposes of this invention. Localization of the
protein to the Golgi is evaluated using an antibody generated against the heterologous
protein using techniques known in the art in concert with commercially available
fluorescent probes that identify the Golgi apparatus.

Expression cloning of multiple transcripts (for example, transcripts encoding
CMP-SA pathway enzymes, glycosyl transferases, and ribozymes or anti-sense RNAs
to suppress hexosaminidases) in a single cell line using techniques known in the art
may be required to bring about the desired sialylation reactions and/or to optimize
these reactions. Alternatively, co-infection of cells with multiple viruses using
techniques known in the art can also be used to simultaneously produce multiple
recombinant transcripts. In addition, plasmids that incorporate multiple foreign genes,
including some under the control of the early promoter IE1, are commercially,
publicly, or otherwise available for the purposes of the invention, and can be used to
create baculovirus constructs. The present invention encompasses using any of these
techniques. The invention also encompasses using the above mentioned types of
vectors to enable expression of desired carbohydrate processing enzymes in
baculovirus infected insect cells prior to production of a heterologous glycoprotein of
interest under control of the very late polyhedrin promoter. In this manner, the
essential N-glycan processing enzymes are already present and can facilitate N-glycan
processing once the glycoprotein of interest is synthesized.

Alternatively, genes for some of the enzymes may be incorporated directly
into the insect cell genome using vectors known in the art, such as, for example,
vectors similar to those described in Jarvis et al. (1990) Bio/Technology 8:950-955 and
Jarvis et al. (1995) Baculovirus Expr. Protocols ed. 39:187-202. Genomic
integration eliminates the need to infect the cells with a large number of viral
constructs. These constructs for genomic integration contain one or more early viral
promoters, including AcMNPV IE1 and 39K, which provide constitutive expression
in transfected insect cells (Jarvis et al. (1990) Bio/Technology 8:950-955). In
addition, a sequential transformation strategy may routinely be developed for
producing stable transformants that constitutively express up to four different
heterologous genes simultaneously. These vectors and transformation techniques are
provided for the purposes of this invention. In this manner, incorporation of plasmids
containing heterologous genes into the insect cell genome combined with baculovirus
infection integrates the metabolic pathways leading to efficient acceptor and donor
substrate production in insect cells.

Generation of N-linked sialylated glycoproteins

The final step in the generation of sialylated glycoproteins or glycolipids in
mammalian cells is the enzymatic transfer of sialic acid from the donor substrate,
CMP-SA, onto an acceptor substrate in the Golgi apparatus, a reaction which is
catalyzed by sialyltransferase. The sialic acid (SA) residues occurring in N-linked
glycoproteins are alpha-linked to the 3 or 6 position of the GalGlcNAc sugars (Tsuji,
S. (1996) J. Biochem. 120:1-13). The SA alpha2-3GalGlcNAc linkage is found in
heterologous glycoproteins expressed by CHO and human cells and the SA
alpha2-6GalGlcNAc linkage is found in many human glycoproteins (Goochee et al. (1991)
Bio/technology 9:1347-1355). The alpha2-3- and/or alpha2-6-sialyltransferase genes
along with a number of other sialyltransferase genes have been cloned, sequenced and
expressed as active heterologous proteins as described in Lee et al. (1989) J. Biol.
Chem. 264:13848-13855, Ichikawa et al. (1992) Anal. Biochem. 202:215-238, Tsuji,
S. (1996) J. Biochem. 120:1-13; U.S. Patent No. 5,047,335, the contents of which are
herein incorporated by reference. Any one or more of these genes, as well as
fragments and/or variants thereof, may be introduced and expressed in cells of interest
using techniques described herein or otherwise known in the art, and may be used
according to the methods of the present invention to enhance the enzymatic transfer of
sialic acid from the donor substrate.

For generating N-linked sialylated glycoproteins in insect cells, once the
donor (CMP-SA) and acceptor (GalGlcNAc-R) substrates are produced as described
above, the methods of the invention further comprise expression of a sialyltransferase,
or fragment or variant thereof, in the cells. The completion of the sialylation reaction
can be verified by elucidating the N-glycan structures attached to a desired
glycoprotein using techniques described herein or otherwise known in the art. It is
recognized that evaluation of N-glycan attachments may also suggest additional
metabolic engineering strategies that can further enhance the level of sialylation in
insect cells.

It was observed that unmodified T. ni insect cell lysates failed to generate any
sialylated compounds when incubated with the substrate, LacMU, and the nucleotide
sugar, CMP-SA. Thus, it is concluded that these cells comprise negligible native
sialyltransferase activity. However, infection of insect cells with a baculovirus
containing alpha2,3 sialyltransferase provided significant enzymatic conversion of
LacMU and CMP-SA to sialylLacMU. For the purposes of the invention,
heterologous sialyltransferase can be expressed using techniques described herein or
otherwise known in the art either by co-infection with a virus coding for
sialyltransferase, or fragment or variant thereof, or by using stable transfectants
expressing the enzyme. In addition to the alpha2,3 sialyltransferase baculovirus
constructs, baculovirus vectors comprising sequences coding for alpha2,6
sialyltransferase and/or fragments or variants thereof, as well as stably transformed
insect cells expressing both Gal T and sialyltransferase, are commercially or publicly
available, and/or may routinely be generated using techniques described herein or
otherwise known in the art. Sialyltransferase activity is evaluated using the
FRET or HPLC assays described herein and/or using other assays known in the art.
Localization of the sialyltransferase to the Golgi is determined using
anti-sialyltransferase antibodies commercially, publicly, or otherwise available for the
purpose of this invention in concert with Golgi-specific marker proteins.

For the purposes of enhancing carbohydrate processing enzymes of the
invention, suppressing activity of endogenous N-acetylglucosaminidase, expressing
heterologous proteins in the cells of the invention, and constructing vectors for the
purposes of the invention, genetic engineering methods are known to those of
ordinary skill in the art. For example, see Schneider, A. et al. (1998) Mol. Gen.
Genet. 257:308-318. Where the invention encompasses utilizing baculovirus-based
expression, such methods are known in the art, for example, as described in O'Reilly
et al. (1992) Baculovirus Expression Vectors, W.H. Freeman and Company, New
York.

For the purposes of enhancing carbohydrate processing enzymes of the
invention, suppressing activity of endogenous N-acetylglucosaminidase, expressing
heterologous proteins in the cells of the invention, and constructing vectors as
described herein, known sequences can be utilized in the methods of the invention,
including but not limited to the sequences described in GenSeq accession No. Z11234
and Z11235 for two human galactosyltransferases (see also United States Patent
Number 5,955,282; the contents of which are herein incorporated by reference);
and/or in Genbank accession No. D83766 for GlcNAc-2-epimerase, Y07744 for the
bifunctional rat liver enzyme capable of catalyzing conversion of UDP-GlcNAc to
ManNAc, J05023 for E. coli CMP-SA synthetase, AJ006215 for murine CMP-SA
synthetase, Z71268 for murine CMP-SA transporter, X03345 for E. coli aldolase,
U05248 for E. coli SA synthetase, X17247 for human 2,6 sialyltransferase, L29553
for human 2,3 sialyltransferase, M13214 for bovine galactosyltransferase, L77081 for
human GlcNAc T-I, U15128 or L36537 for human GlcNAc T-II, D87969 for human
CMP-SA transporter, and S95936 for human transferrin; and fragments or variants of
the enzymes that display one or more of the biological activities of the enzymes (such
biological activities may routinely be assayed using techniques described herein or
otherwise known in the art). The sequences described above are readily accessible
using the provided accession number in the NCBI Entrez database, known to the
person of ordinary skill in the art.
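
For convenience, the accession numbers recited above can be collected into a simple lookup table. The following is a minimal sketch; the dictionary and function names are hypothetical, the entries only restate the accessions and descriptions given in the paragraph above, and no database query is performed.

```python
# Accession numbers and descriptions as recited in the text above.
# The first two are GenSeq accessions; the remainder are Genbank accessions.
ACCESSIONS = {
    "Z11234": "human galactosyltransferase (GenSeq)",
    "Z11235": "human galactosyltransferase (GenSeq)",
    "D83766": "GlcNAc-2-epimerase",
    "Y07744": "bifunctional rat liver enzyme (UDP-GlcNAc to ManNAc)",
    "J05023": "E. coli CMP-SA synthetase",
    "AJ006215": "murine CMP-SA synthetase",
    "Z71268": "murine CMP-SA transporter",
    "X03345": "E. coli aldolase",
    "U05248": "E. coli SA synthetase",
    "X17247": "human 2,6 sialyltransferase",
    "L29553": "human 2,3 sialyltransferase",
    "M13214": "bovine galactosyltransferase",
    "L77081": "human GlcNAc T-I",
    "U15128": "human GlcNAc T-II",
    "L36537": "human GlcNAc T-II",
    "D87969": "human CMP-SA transporter",
    "S95936": "human transferrin",
}


def find_accessions(keyword):
    """Return sorted accession numbers whose description contains keyword."""
    needle = keyword.lower()
    return sorted(acc for acc, desc in ACCESSIONS.items() if needle in desc.lower())


if __name__ == "__main__":
    for accession in find_accessions("CMP-SA transporter"):
        print(accession, ACCESSIONS[accession])
```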

Thus, one aspect of the invention provides for use of isolated nucleic acid
molecules comprising polynucleotides having nucleotide sequences selected from the
group consisting of: (a) nucleotide sequences encoding a biologically active
fragment or variant of the polypeptide having the amino acid sequence described in
GenSeq accession No. Z11234 and Z11235 for two human galactosyltransferases;
and/or in Genbank accession No. D83766 for GlcNAc-2-epimerase, Y07744 for the
bifunctional rat liver enzyme capable of catalyzing conversion of UDP-GlcNAc to
ManNAc, J05023 for E. coli CMP-SA synthetase, AJ006215 for murine CMP-SA
synthetase, Z71268 for murine CMP-SA transporter, X03345 for E. coli aldolase,
U05248 for E. coli SA synthetase, X17247 for human 2,6 sialyltransferase, L29553
for human 2,3 sialyltransferase, M13214 for bovine galactosyltransferase, L77081 for
human GlcNAc T-I, U15128 or L36537 for human GlcNAc T-II, D87969 for human
CMP-SA transporter, and/or S95936 for human transferrin; (b) nucleotide sequences
encoding an antigenic fragment of the polypeptide having the amino acid sequence
described in GenSeq accession No. Z11234 and Z11235 for two human
galactosyltransferases (see also United States Patent Number 5,955,282; the contents
of which are herein incorporated by reference); and/or in Genbank accession No.
D83766 for GlcNAc-2-epimerase, Y07744 for the bifunctional rat liver enzyme
capable of catalyzing conversion of UDP-GlcNAc to ManNAc, J05023 for E. coli
CMP-SA synthetase, AJ006215 for murine CMP-SA synthetase, Z71268 for murine
CMP-SA transporter, X03345 for E. coli aldolase, U05248 for E. coli SA synthetase,
X17247 for human 2,6 sialyltransferase, L29553 for human 2,3 sialyltransferase,
M13214 for bovine galactosyltransferase, L77081 for human GlcNAc T-I, U15128 or
L36537 for human GlcNAc T-II, D87969 for human CMP-SA transporter, and/or
S95936 for human transferrin; and (c) nucleotide sequences complementary to any of
the nucleotide sequences in (a) or (b), above. Polypeptides encoded by such nucleic
acids may also be used according to the methods of the present invention. Further
embodiments of the invention include use of isolated nucleic acid molecules that
comprise a polynucleotide having a nucleotide sequence at least 80%, 85%, or 90%
identical, and more preferably at least 95%, 97%, 98% or 99% identical, to any of the
above nucleotide sequences, or a polynucleotide which hybridizes under stringent
hybridization conditions to a polynucleotide that is complementary to any of the
above nucleotide sequences. This polynucleotide which hybridizes does not hybridize
under stringent hybridization conditions to a polynucleotide having a nucleotide
sequence consisting of only A residues or of only T residues. Polypeptides encoded
by such nucleic acids may also be used according to the methods of the present
invention. Preferably, the nucleic acid sequences (including fragments or variants)
that may be used according to the methods of the present invention encode a
polypeptide having a biological activity. Such biological activity may routinely be
assayed using techniques described herein or otherwise known in the art.
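
To make the percent identity thresholds above concrete, the following minimal sketch computes identity for two sequences that have already been aligned to equal length. It assumes one particular convention (identical non-gap columns divided by the total number of alignment columns); other conventions exist and give different values, and the function names are hypothetical. It is not the alignment method used in this disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Convention assumed here: identical non-gap positions divided by the
    number of alignment columns. Comparison is case-insensitive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-column alignment differing at a single position.
    a, b = "ACGTACGTAC", "ACGTACGTAA"
    print(percent_identity(a, b), meets_threshold(a, b, 80.0))
```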

In addition to the sequences described above, the nucleotide sequences and
amino acid sequences disclosed in Figures 27-32, and fragments and variants of these
sequences, may also be used according to the methods of the invention.

In one embodiment, specific en^me polypeptides comprise the amino acid 

20 sequences shown in Figures 28, 30 and 32; or otherwise described herein. However, 
the invention also encompasses sequence variants of the polypeptide sequences shown 
in Figures 28, 30 and 32. 

In a specific embodiment, one, two, three, four, five or more human 
polynucleotide sequences, or fragments, or variants thereof, and/or the polypeptides
encoded thereby, are used according to the methods of the present invention to
convert ManNAc to SA (see Example 6). Such polynucleotide and polypeptide
sequences include, but are not limited to, sequences corresponding to human aldolase
(SEQ ID NO:1 and SEQ ID NO:2), human CMP-SA synthetase (SEQ ID NO:3 and
SEQ ID NO:4), and human SA synthetase (SEQ ID NO:5 and SEQ ID NO:6); see
also Figures 27-32. Thus, in certain embodiments the methods of the present invention
include the use of one or more novel isolated nucleic acid molecules comprising 
polynucleotides encoding polypeptides important to intracellular carbohydrate 
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processing in humans. Such polynucleotide sequences include those disclosed in the 
figures and/or Sequence Listing and/or encoded by the human cDNA plasmids 
(Human CMP-Sialic Acid Synthetase, cDNA clone HWLLM34; Human Sialic Acid 
Synthetase, cDNA clone HA5AA37; and Human Aldolase cDNA clone HDPAK85) 
deposited with the American Type Culture Collection (ATCC) on February 24, 2000
and receiving accession numbers . The present invention further includes
the use of polypeptides encoded by these polynucleotides. The present invention also
provides for use of isolated nucleic acid molecules encoding fragments and variants of 
these polypeptides, and for the polypeptides encoded by these nucleic acids. 

Thus, one aspect of the invention provides for use of isolated nucleic acid
molecules comprising polynucleotides having nucleotide sequences selected from the
group consisting of: (a) nucleotide sequences encoding human aldolase having the
amino acid sequences as shown in SEQ ID NO:2; (b) nucleotide sequences encoding
a biologically active fragment of the human aldolase polypeptide having the amino
acid sequence shown in SEQ ID NO:2; (c) nucleotide sequences encoding an
antigenic fragment of the human aldolase polypeptide having the amino acid sequence
shown in SEQ ID NO:2; (d) nucleotide sequences encoding the human aldolase
polypeptide comprising the complete amino acid sequence encoded by the plasmid
contained in the ATCC Deposit; (e) nucleotide sequences encoding a biologically
active fragment of the human aldolase polypeptide having the amino acid sequence
encoded by the plasmid contained in the ATCC Deposit; (f) a nucleotide sequence
encoding an antigenic fragment of the human aldolase polypeptide having the amino
acid sequence encoded by the plasmid contained in the ATCC Deposit; and (g)
nucleotide sequences complementary to any of the nucleotide sequences in (a)
through (f), above. Polypeptides encoded by such nucleic acids may also be used
according to the methods of the present invention. Further embodiments of the
invention include use of isolated nucleic acid molecules that comprise a
polynucleotide having a nucleotide sequence at least 80%, 85%, or 90% identical, and
more preferably at least 95%, 97%, 98% or 99% identical, to any of the nucleotide
sequences in (a), (b), (c), (d), (e), (f), or (g), above, or a polynucleotide which
hybridizes under stringent hybridization conditions to a polynucleotide in (a), (b), (c),
(d), (e), (f), or (g), above. This polynucleotide which hybridizes does not hybridize
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under stringent hybridization conditions to a polynucleotide having a nucleotide 
sequence consisting of only A residues or of only T residues. Polypeptides encoded 
by such nucleic acids may also be used according to the methods of the present 
invention. 

Another aspect of the invention provides for use of isolated nucleic acid
molecules comprising polynucleotides having nucleotide sequences selected from the
group consisting of: (a) nucleotide sequences encoding human CMP-SA synthetase
having the amino acid sequences as shown in SEQ ID NO:4; (b) nucleotide
sequences encoding a biologically active fragment of human CMP-SA synthetase
polypeptide having the amino acid sequence shown in SEQ ID NO:4; (c) nucleotide
sequences encoding an antigenic fragment of the human CMP-SA synthetase
polypeptide having the amino acid sequence shown in SEQ ID NO:4; (d) nucleotide
sequences encoding the human CMP-SA synthetase polypeptide comprising the
complete amino acid sequence encoded by the plasmid contained in the ATCC
Deposit; (e) nucleotide sequences encoding a biologically active fragment of the
human CMP-SA synthetase polypeptide having the amino acid sequence encoded by
the plasmid contained in the ATCC Deposit; (f) a nucleotide sequence encoding an
antigenic fragment of the human CMP-SA synthetase polypeptide having the amino
acid sequence encoded by the plasmid contained in the ATCC Deposit; and (g)
nucleotide sequences complementary to any of the nucleotide sequences in (a)
through (f), above. Polypeptides encoded by such nucleic acids may also be used
according to the methods of the present invention. Further embodiments of the
invention include use of isolated nucleic acid molecules that comprise a
polynucleotide having a nucleotide sequence at least 80%, 85%, or 90% identical, and
more preferably at least 95%, 97%, 98% or 99% identical, to any of the nucleotide
sequences in (a), (b), (c), (d), (e), (f), or (g) above, or a polynucleotide which
hybridizes under stringent hybridization conditions to a polynucleotide in (a), (b), (c),
(d), (e), (f), or (g), above. This polynucleotide which hybridizes does not hybridize
under stringent hybridization conditions to a polynucleotide having a nucleotide
sequence consisting of only A residues or of only T residues. Polypeptides encoded
by such nucleic acids may also be used according to the methods of the present
invention. 
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Another aspect of the invention provides for use of isolated nucleic acid 
molecules comprising polynucleotides having nucleotide sequences selected from the 
group consisting of: (a) nucleotide sequences encoding human SA synthetase having 
the amino acid sequences as shown in SEQ ID NO:6; (b) nucleotide sequences 
encoding a biologically active fragment of the human SA synthetase polypeptide
having the amino acid sequence shown in SEQ ID NO:6; (c) nucleotide sequences
encoding an antigenic fragment of the human SA synthetase polypeptide having the
amino acid sequence shown in SEQ ID NO:6; (d) nucleotide sequences encoding the
human SA synthetase polypeptide comprising the complete amino acid sequence
encoded by the plasmid contained in the ATCC Deposit; (e) nucleotide sequences
encoding a biologically active fragment of the human SA synthetase polypeptide
having the amino acid sequence encoded by the plasmid contained in the ATCC
Deposit; (f) a nucleotide sequence encoding an antigenic fragment of the human SA
synthetase polypeptide having the amino acid sequence encoded by the plasmid
contained in the ATCC Deposit; and (g) nucleotide sequences complementary to any
of the nucleotide sequences in (a) through (f), above. Polypeptides encoded by such
nucleic acids may also be used according to the methods of the present invention.
Further embodiments of the invention include use of isolated nucleic acid molecules
that comprise a polynucleotide having a nucleotide sequence at least 80%, 85%, or
90% identical, and more preferably at least 95%, 97%, 98% or 99% identical, to any
of the nucleotide sequences in (a), (b), (c), (d), (e), (f), or (g) above, or a
polynucleotide which hybridizes under stringent hybridization conditions to a
polynucleotide in (a), (b), (c), (d), (e), (f), or (g), above. This polynucleotide which
hybridizes does not hybridize under stringent hybridization conditions to a
polynucleotide having a nucleotide sequence consisting of only A residues or of only
T residues. Polypeptides encoded by such nucleic acids may also be used according to 
the methods of the present invention. 

By a nucleic acid having a nucleotide sequence at least, for example, 95%
"identical" to a reference nucleotide sequence of the present invention, it is intended
that the nucleotide sequence of the nucleic acid is identical to the reference sequence
except that the nucleotide sequence may include up to five point mutations per each 
100 nucleotides of the reference nucleotide sequence encoding the described 
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polypeptide. In other words, to obtain a nucleic acid having a nucleotide sequence at 
least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in 
the reference sequence may be deleted or substituted with another nucleotide, or a 
number of nucleotides up to 5% of the total nucleotides in the reference sequence may 
be inserted into the reference sequence. The query sequence may be an entire
sequence, such as, for example, that shown in SEQ ID NO:1, the ORF (open reading
frame), or any fragment as described herein. 

As a practical matter, whether any particular nucleic acid molecule or
polypeptide is at least, for example, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%
identical to a nucleotide sequence of the present invention can be determined
conventionally using known computer programs. A preferred method for determining
the best overall match between a query sequence (a sequence of the present invention)
and a subject sequence, also referred to as a global sequence alignment, can be
determined using the FASTDB computer program based on the algorithm of Brutlag
et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and
subject sequences are both DNA sequences. An RNA sequence can be compared by
converting U's to T's. The result of said global sequence alignment is in percent
identity. Preferred parameters used in a FASTDB alignment of DNA sequences to
calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1,
Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap
Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject
nucleotide sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence because of 5' or 3'
deletions, not because of internal deletions, a manual correction must be made to the
results. This is because the FASTDB program does not account for 5' and 3'
truncations of the subject sequence when calculating percent identity. For subject
sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent
identity is corrected by calculating the number of bases of the query sequence that are
5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the
total bases of the query sequence. Whether a nucleotide is matched/aligned is
determined by results of the FASTDB sequence alignment. This percentage is then
subtracted from the percent identity, calculated by the above FASTDB program using
the specified parameters, to arrive at a final percent identity score. This corrected
score is what is used for the purposes of the present invention. Only bases outside the
5' and 3' bases of the subject sequence, as displayed by the FASTDB alignment,
which are not matched/aligned with the query sequence, are calculated for the
purposes of manually adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query
sequence to determine percent identity. The deletions occur at the 5' end of the
subject sequence and therefore, the FASTDB alignment does not show a
match/alignment of the first 10 bases at the 5' end. The 10 unpaired bases represent
10% of the sequence (number of bases at the 5' and 3' ends not matched/total number
of bases in the query sequence) so 10% is subtracted from the percent identity score
calculated by the FASTDB program. If the remaining 90 bases were perfectly
matched the final percent identity would be 90%. In another example, a 90 base
subject sequence is compared with a 100 base query sequence. This time the
deletions are internal deletions so that there are no bases on the 5' or 3' of the subject
sequence which are not matched/aligned with the query. In this case the percent
identity calculated by FASTDB is not manually corrected. Once again, only bases 5'
and 3' of the subject sequence which are not matched/aligned with the query sequence
are manually corrected for. No other manual corrections are to be made for the purposes
of the present invention.
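
The manual correction described above amounts to a single subtraction. The following Python sketch is illustrative only: the function and parameter names are hypothetical and are not part of the FASTDB program, and the percent identity reported by the alignment program is taken as a given input.

```python
def corrected_percent_identity(reported_identity, query_length,
                               unmatched_5prime, unmatched_3prime):
    """Apply the terminal-truncation correction to a reported percent identity.

    reported_identity: percent identity (0-100) reported by the alignment program.
    query_length:      total number of bases in the query sequence.
    unmatched_5prime:  query bases lying 5' of the subject that are not matched/aligned.
    unmatched_3prime:  query bases lying 3' of the subject that are not matched/aligned.

    Internal deletions are not counted; only terminal, unaligned query bases are.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if unmatched_5prime < 0 or unmatched_3prime < 0:
        raise ValueError("unmatched base counts must be non-negative")
    if unmatched_5prime + unmatched_3prime > query_length:
        raise ValueError("unmatched bases cannot exceed the query length")
    # Unmatched terminal bases expressed as a percent of the total query bases.
    penalty = 100.0 * (unmatched_5prime + unmatched_3prime) / query_length
    return reported_identity - penalty


# First worked example from the text: 10 unaligned bases at the 5' end of a
# 100 base query, with the remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 100, 10, 0))

# Second worked example: internal deletions only, so nothing is subtracted.
print(corrected_percent_identity(90.0, 100, 0, 0))
```

The same arithmetic applies to polypeptide comparisons if residues at the N- and C-termini are counted in place of bases at the 5' and 3' ends.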

By a polypeptide having an amino acid sequence at least, for example, 95% 
"identical" to a query amino acid sequence of the present invention, it is intended that 
the amino acid sequence of the subject polypeptide is identical to the query sequence 
except that the subject polypeptide sequence may include up to five amino acid
alterations per each 100 amino acids of the query amino acid sequence. In other
words, to obtain a polypeptide having an amino acid sequence at least 95% identical
to a query amino acid sequence, up to 5% of the amino acid residues in the subject
sequence may be inserted, deleted (indels) or substituted with another amino acid.
These alterations of the reference sequence may occur at the amino or carboxy
terminal positions of the reference amino acid sequence or anywhere between those
terminal positions, interspersed either individually among residues in the reference 
sequence or in one or more contiguous groups within the reference sequence. 
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As a practical matter, whether any particular polypeptide is at least, for 
example, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identical to, for example, 
the amino acid sequences of SEQ ID NO:2 or to the amino acid sequence encoded by 
the cDNA contained in a deposited clone can be determined conventionally using 
known computer programs. A preferred method for determining the best overall
match between a query sequence (a sequence of the present invention) and a subject
sequence, also referred to as a global sequence alignment, can be determined using
the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App.
Biosci. 6:237-245 (1990)). In a sequence alignment the query and subject sequences
are either both nucleotide sequences or both amino acid sequences. The result of said
global sequence alignment is in percent identity. Preferred parameters used in a
FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1,
Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C- 
terminal deletions, not because of internal deletions, a manual correction must be 
made to the results. This is because the FASTDB program does not account for N- 
and C-terminal truncations of the subject sequence when calculating global percent
identity. For subject sequences truncated at the N- and C-termini, relative to the
query sequence, the percent identity is corrected by calculating the number of residues
of the query sequence that are N- and C-terminal of the subject sequence, which are
not matched/aligned with a corresponding subject residue, as a percent of the total
residues of the query sequence. Whether a residue is matched/aligned is determined by
results of the FASTDB sequence alignment. This percentage is then subtracted from
the percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This final percent identity score
is what is used for the purposes of the present invention. Only residues to the N- and
C-termini of the subject sequence, which are not matched/aligned with the query
sequence, are considered for the purposes of manually adjusting the percent identity
score. That is, only query residue positions outside the farthest N- and C-terminal 
residues of the subject sequence. 
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For example, a 90 amino acid residue subject sequence is aligned with a 100 
residue query sequence to determine percent identity. The deletion occurs at the N- 
terminus of the subject sequence and therefore, the FASTDB alignment does not 
show a matching/alignment of the first 10 residues at the N-terminus. The 10 
unpaired residues represent 10% of the sequence (number of residues at the N- and C-
termini not matched/total number of residues in the query sequence) so 10% is
subtracted from the percent identity score calculated by the FASTDB program. If the
remaining 90 residues were perfectly matched the final percent identity would be
90%. In another example, a 90 residue subject sequence is compared with a 100
residue query sequence. This time the deletions are internal deletions so there are no
residues at the N- or C-termini of the subject sequence which are not matched/aligned
with the query. In this case the percent identity calculated by FASTDB is not
manually corrected. Once again, only residue positions outside the N- and C-terminal
ends of the subject sequence, as displayed in the FASTDB alignment, which are not
matched/aligned with the query sequence are manually corrected for. No other
manual corrections are to be made for the purposes of the present invention.

In another embodiment of the invention, to determine the percent homology of 
two amino acid sequences, or of two nucleic acids, the sequences are aligned for 
optimal comparison purposes (e.g., gaps can be introduced in the sequence of one
protein or nucleic acid for optimal alignment with the other protein or nucleic acid).
The amino acid residues or nucleotides at corresponding amino acid positions or
nucleotide positions are then compared. When a position in one sequence is occupied
by the same amino acid residue or nucleotide as the corresponding position in the
other sequence, then the molecules are homologous at that position. As used herein,
amino acid or nucleic acid "homology" is equivalent to amino acid or nucleic acid
"identity". The percent homology between the two sequences is a function of the 
number of identical positions shared by the sequences (i.e., per cent homology equals 
the number of identical positions/total number of positions times 100). 
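
The percent homology calculation just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed implementation: the function name is hypothetical, the two sequences are assumed to be already aligned to equal length with "-" marking introduced gaps, and "total number of positions" is taken here to mean the number of alignment columns.

```python
def percent_homology(aligned_a, aligned_b, gap="-"):
    """Percent homology = identical positions / total positions x 100.

    Both inputs must already be aligned (equal length, gaps shown as `gap`).
    A column counts as identical only when both sequences carry the same
    residue or nucleotide; a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Four aligned positions, three of them identical.
print(percent_homology("ACGT", "ACGA"))

# A gap introduced for optimal alignment occupies a position but is not a match.
print(percent_homology("AC-T", "ACGT"))
```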

Variants of the above-described sequences include a substantially homologous
protein encoded by the same genetic locus in an organism, i.e., an allelic variant.
Variants also encompass proteins derived from other genetic loci in an organism, but
having substantial homology to the proteins of Figures 27-32, or otherwise described
herein. Variants also include proteins substantially homologous to the protein but
derived from another organism, i.e., an ortholog. Variants also include proteins that
are substantially homologous to the proteins that are produced by chemical synthesis.
Variants also include proteins that are substantially homologous to the proteins that
are produced by recombinant methods. As used herein, two proteins (or a region of
the proteins) are substantially homologous when the amino acid sequences are at least
about 55-60%, typically at least about 70-75%, more typically at least about 80-85%,
and most typically at least about 90-95% or more homologous. A substantially
homologous amino acid sequence, according to the present invention, will be encoded
by a nucleic acid sequence hybridizing to the nucleic acid sequence, or portion
thereof, of the sequence shown in Figures 27, 28, 31 or otherwise described herein
under stringent conditions as more fully described below.

Orthologs, homologs, and allelic variants that are encompassed by the 
invention and that may be used according to the methods of the invention can be
identified using methods well known in the art. These variants comprise a nucleotide
sequence encoding a protein that is at least about 55%, typically at least about 70-
75%, more typically at least about 80-85%, and most typically at least about 90-95%
or more homologous to the nucleotide sequence shown in Figures 27, 29, 31, or
otherwise described herein, or a fragment of this sequence. Such nucleic acid
molecules can readily be identified as being able to hybridize under stringent
conditions, to the nucleotide sequence shown in Figures 27, 29, 31, or complementary
sequence thereto, or otherwise described herein, or a fragment of the sequence. It is
understood that stringent hybridization does not indicate substantial homology where
it is due to general homology, such as poly A sequences, or sequences common to all
or most proteins in an organism or class of proteins.

The invention also encompasses polypeptides having a lower degree of 
identity but having sufficient similarity so as to perform one or more of the same 
functions performed by the enzyme polypeptides described herein. Similarity is 
determined by conserved amino acid substitution. Such substitutions are those that
substitute a given amino acid in a polypeptide by another amino acid of like
characteristics (see Table 1). Conservative substitutions are likely to be
phenotypically silent. Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile;
interchange of the hydroxyl residues Ser and Thr, exchange of the acidic residues Asp
and Glu, substitution between the amide residues Asn and Gln, exchange of the basic
residues Lys and Arg and replacements among the aromatic residues Phe and Tyr.
Guidance concerning which amino acid changes are likely to be phenotypically silent
is found in Bowie et al., Science 247:1306-1310 (1990).



TABLE 1. Conservative Amino Acid Substitutions.

Aromatic       Phenylalanine
               Tryptophan
               Tyrosine
Hydrophobic    Leucine
               Isoleucine
               Valine
Polar          Glutamine
               Asparagine
Basic          Arginine
               Lysine
               Histidine
Acidic         Aspartic Acid
               Glutamic Acid
Small          Alanine
               Serine
               Threonine
               Methionine
               Glycine
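
By way of illustration, the groupings of Table 1 can be expressed as a simple lookup that reports whether a given replacement falls within one group. The Python sketch below is hypothetical (the names TABLE_1_GROUPS and is_conservative are not from the specification). Asparagine is placed with glutamine in the Polar group on the assumption that the amide pairing of Asn and Gln named in the text belongs there.

```python
# Three-letter residue codes grouped as in Table 1.
# Assumption: Asn is grouped with Gln, following the amide pairing in the text.
TABLE_1_GROUPS = {
    "Aromatic": {"Phe", "Trp", "Tyr"},
    "Hydrophobic": {"Leu", "Ile", "Val"},
    "Polar": {"Gln", "Asn"},
    "Basic": {"Arg", "Lys", "His"},
    "Acidic": {"Asp", "Glu"},
    "Small": {"Ala", "Ser", "Thr", "Met", "Gly"},
}


def is_conservative(original, replacement):
    """Return True when both residues fall in the same Table 1 group."""
    if original == replacement:
        return False  # identical residues are not a substitution at all
    return any(
        original in group and replacement in group
        for group in TABLE_1_GROUPS.values()
    )


print(is_conservative("Lys", "Arg"))  # both Basic
print(is_conservative("Asp", "Lys"))  # Acidic versus Basic
```

A lookup of this kind only classifies a substitution by group membership; whether a particular change is phenotypically silent still has to be confirmed by assay.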



Both identity and similarity can be readily calculated (Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988;
Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press,
New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A.M., and
Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular
Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). Preferred
computer program methods to determine identity and similarity between two
sequences include, but are not limited to, GCG program package (Devereux, J. (1984)
Nuc. Acids Res. 12(1):387), BLASTP, BLASTN, FASTA (Altschul, S.F. (1990) J.
Molec. Biol. 215:403).

A variant polypeptide can differ in amino acid sequence by one or more 
substitutions, deletions, insertions, inversions, fusions, and truncations or a 
combination of any of these. 

Variant polypeptides can be fully functional or can lack function in one or
more activities. Thus, in the present case, variations can affect the function, for
example, of one or more of the modules, domains, or functional subregions of the
enzyme polypeptides of the invention. Preferably, polypeptide variants and fragments
have the described activities routinely assayed via bioassays described herein or
otherwise known in the art.

Fully functional variants typically contain only conservative variation or 
variation in non-critical residues or in non-critical regions. Functional variants can 
also contain substitution of similar amino acids, which result in no change or an 
insignificant change in function. Alternatively, such substitutions may positively or
negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino 
acid substitutions, deletions, insertions, inversions, or truncation or a substitution, 
insertion, inversion, or deletion in a critical residue or critical region. As indicated, 
variants can be naturally-occurring or can be made by recombinant means or chemical
synthesis to provide useful and novel characteristics for the polypeptide.

Amino acids that are essential for function can be identified by methods 
known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis 
(Cunningham et al, Science 244:1081-1085 (1989)). The latter procedure introduces 
single alanine mutations at every residue in the molecule. The resulting mutant
molecules are then tested for biological activity. Sites that are critical can also be
determined by structural analysis such as crystallization, nuclear magnetic resonance
or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904 (1992); de Vos et al.,
Science 255:306-312 (1992)).
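
As an illustration of the alanine-scanning procedure described above, the following Python sketch enumerates the single-alanine mutants of a polypeptide given in one-letter code. The function name and the mutant labels (e.g. "K2A") are hypothetical conventions chosen for the example. Positions that are already alanine are skipped, which is an assumption of this sketch and not a statement from the cited reference.

```python
def alanine_scan(protein):
    """Return (label, mutant_sequence) pairs, one per non-alanine position.

    protein: amino acid sequence in one-letter code.
    Labels use 1-based residue numbering, e.g. "K2A" for Lys-2 to Ala.
    """
    mutants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue  # assumption: an existing alanine yields no new mutant
        label = f"{residue}{index + 1}A"
        mutant = protein[:index] + "A" + protein[index + 1:]
        mutants.append((label, mutant))
    return mutants


for label, sequence in alanine_scan("MKAL"):
    print(label, sequence)
```

Each mutant sequence produced this way would then be expressed and tested for biological activity, as the text describes.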

The invention further encompasses variant polynucleotides, and fragments 
thereof, that differ from the nucleotide sequence, such as, for example, those shown in
Figures 27, 29, 31 or otherwise described herein, due to degeneracy of the genetic
code and thus encode the same protein as that encoded by the nucleotide sequence
shown in Figures 27, 29, 31 or otherwise described herein.
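
Degeneracy of the genetic code can be illustrated with a short Python sketch showing that two different nucleotide sequences may encode the same polypeptide. The example sequences are arbitrary and are not taken from the figures or the Sequence Listing. The codon table is the standard genetic code, and the names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two arbitrary sequences that differ at the third codon positions.
variant_1 = "ATGCTGAAA"
variant_2 = "ATGTTAAAG"
print(translate(variant_1), translate(variant_2))
print(variant_1 != variant_2 and translate(variant_1) == translate(variant_2))
```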

The invention also provides nucleic acid molecules encoding the variant 
polypeptides described herein. Such polynucleotides may be naturally occurring, 
such as allelic variants (same locus), homologs (different locus), and orthologs
(different organism), or may be constructed by recombinant DNA methods or by
chemical synthesis. Such non-naturally occurring variants may be made by
mutagenesis techniques, including those applied to polynucleotides, cells, or
organisms. Accordingly, as discussed above, the variants can contain nucleotide
substitutions, deletions, inversions and insertions.

Variation can occur in either or both the coding and non-coding regions. The 
variations can produce both conservative and non-conservative amino acid 
substitutions. 

"Polynucleotides" or "nucleic acids" that may be used according to the 
methods of the invention also include those polynucleotides capable of hybridizing,
under stringent hybridization conditions, to sequences contained in SEQ ID NO:1, the
complement thereof, or a cDNA within the deposited plasmids. As used herein, the
term "hybridizes under stringent conditions" is intended to describe conditions for
hybridization and washing under which nucleotide sequences encoding a receptor at
least 55% homologous to each other typically remain hybridized to each other. The
conditions can be such that sequences at least about 65%, at least about 70%, or at
least about 75% or more homologous to each other typically remain hybridized to
each other. Such stringent conditions are known to those skilled in the art and can be
found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989),
6.3.1-6.3.6. One example of stringent hybridization conditions is hybridization in
6X sodium chloride/sodium citrate (SSC) at about 45 degrees C, followed by one or
more washes in 0.2X SSC, 0.1% SDS at 50-65 degrees C.
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Also contemplated for use according to the methods of the invention are 
nucleic acid molecules that hybridize to a polynucleotide disclosed herein under lower 
stringency hybridization conditions. Changes in the stringency of hybridization and 
signal detection are primarily accomplished through the manipulation of formamide 
concentration (lower percentages of formamide result in lowered stringency), salt
conditions, or temperature. For example, lower stringency conditions include an
overnight incubation at 37 degrees C in a solution comprising 6X SSPE (20X SSPE =
3M NaCl; 0.2M NaH2PO4; 0.02M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100
ug/ml salmon sperm blocking DNA; followed by washes at 50 degrees C with
1X SSPE, 0.1% SDS. In addition, to achieve even lower stringency, washes
performed following stringent hybridization can be done at higher salt concentrations
(e.g. 5X SSC).
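
The working concentrations implied by the 6X and 1X SSPE solutions follow directly from the 20X SSPE composition given above. The Python sketch below shows that arithmetic; the names are hypothetical and the calculation assumes simple linear dilution of the 20X stock.

```python
# Molar composition of 20X SSPE as stated in the text.
STOCK_20X_SSPE = {"NaCl": 3.0, "NaH2PO4": 0.2, "EDTA": 0.02}
STOCK_FOLD = 20.0


def working_concentrations(fold):
    """Molar concentration of each component in an SSPE solution of strength `fold`X."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: molar * fold / STOCK_FOLD for name, molar in STOCK_20X_SSPE.items()}


def stock_volume(final_volume_ml, fold):
    """Volume of 20X stock (ml) needed for `final_volume_ml` of a `fold`X solution."""
    if fold <= 0 or final_volume_ml <= 0:
        raise ValueError("fold and final volume must be positive")
    return final_volume_ml * fold / STOCK_FOLD


print(working_concentrations(6))   # hybridization solution, 6X SSPE
print(working_concentrations(1))   # wash solution, 1X SSPE
print(stock_volume(100, 6))        # ml of 20X stock per 100 ml of 6X solution
```

On this arithmetic, 6X SSPE is approximately 0.9 M NaCl, 0.06 M NaH2PO4 and 0.006 M EDTA.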

Note that variations in the above conditions may be accomplished through the 
inclusion and/or substitution of alternate blocking reagents used to suppress
background in hybridization experiments. Typical blocking reagents include
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and 
commercially available proprietary formulations. The inclusion of specific blocking 
reagents may require modification of the hybridization conditions described above, 
due to problems with compatibility. 

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such
as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly (A) stretch or the complement thereof (e.g., practically
any double-stranded cDNA clone generated using oligo-dT as a primer).

In one embodiment, an isolated nucleic acid molecule that hybridizes under 
stringent conditions to a sequence disclosed herein, or the complement thereof, such 
as, for example, the sequence of Figures 27, 29, 31, corresponds to a naturally-
occurring nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid
molecule refers to an RNA or DNA molecule having a nucleotide sequence that
occurs in nature (e.g., encodes a natural protein). 
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The present invention also encompasses recombinant vectors, which include 
the isolated nucleic acid molecules and polynucleotides that may be used according to 
the methods of the present invention, and to host cells containing the recombinant 
vectors and/or nucleic acid molecules, as well as to methods of making such vectors 
and host cells and for using them for production of glycosylation enzymes by
recombinant techniques. Polypeptides produced by such methods are also provided. 

The invention encompasses utilizing vectors for the maintenance (cloning 
vectors) or vectors for expression (expression vectors) of the desired polynucleotides 
encoding the carbohydrate processing enzymes of the invention, or those encoding
proteins to be sialylated by the methods of the invention and/or by expression of the
proteins in the cells of the invention. The vectors can function in prokaryotic or
eukaryotic cells or
in both (shuttle vectors). 

In one embodiment, one or more of the polynucleotide sequences used 
according to the methods of the invention are inserted into commercially, publicly, or
otherwise available baculovirus expression vectors for enhanced expression of the
corresponding enzyme. In another non-exclusive embodiment, one or more of the
polynucleotides used according to the methods of the invention are inserted into other
viral vectors or used for generation of stable insect cell lines. Techniques known in
the art, such as, for example, HPAEC and HPLC techniques, may be routinely used to
evaluate the enzymatic activity of these enzymes from both eukaryotic and bacterial
sources to determine which source is best for generating SA in insect cells. 

Generally, expression vectors contain cis-acting regulatory regions that are
operably linked in the vector to the polynucleotide to be expressed, or other relevant
polynucleotides such that transcription of the polynucleotides is allowed in a host cell.
The polynucleotides can be introduced into the host cell with a separate
polynucleotide capable of affecting transcription. Thus, the second polynucleotide
may provide a trans-acting factor interacting with the cis-regulatory control region to
allow transcription of the polynucleotides from the vector. Alternatively, a
trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor
can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription of the 
polynucleotides can occur in a cell-free system. 




The regulatory sequences to which the polynucleotides described herein can be
operably linked include, for example, promoters for directing mRNA transcription.
These promoters include, but are not limited to, baculovirus promoters including, but
not limited to, IE0, IE1, IE2, 39k, 35k, egt, ME53, ORF 142, PE38, p6.9, capsid,
gp64, polyhedrin, p10, basic, and core; and insect cell promoters including, but not
limited to, Drosophila actin, metallothionein, and the like. Where the host cell is not
an insect cell, such promoters include, but are not limited to, the left promoter from
bacteriophage lambda, the lac, TRP, and TAC promoters from E. coli, and promoters from
Actinomycetes, including Nocardia and Streptomyces.

Promoters may be isolated, if they have not already been isolated, by standard
promoter identification and trapping methods known in the art; see, for example,
Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, NY, (1989).

It would be understood by a person of ordinary skill in the art that the choice
of promoter would depend upon the choice of host cell. Similarly, the choice of host
cell will depend upon the use of the host cell. Accordingly, host cells can be used for
simply amplifying, but not expressing, the nucleic acid. However, host cells can also
be used to produce desirable amounts of the desired polypeptide. In this embodiment,
the host cell is simply used to express the protein per se. For example, amounts of the
protein could be produced that enable its purification and subsequent use, for
example, in a cell free system. In this case, the promoter is compatible with the host
cell. Host cells can be chosen from virtually any of the known host cells that are
manipulated by the methods of the invention to produce the desired glycosylation
patterns. These could include mammalian, bacterial, yeast, filamentous fungi, or plant
cells.

In addition to control regions that promote transcription, expression vectors 
may also include regions that modulate transcription, such as repressor binding sites 
and enhancers. 

In addition to containing sites for transcription initiation and control,
expression vectors can also contain sequences necessary for transcription termination
and, in the transcribed region, a ribosome binding site for translation. Other regulatory
control elements for expression include initiation and termination codons as well as
polyadenylation signals. The person of ordinary skill in the art would be aware of the
numerous regulatory sequences that are useful in expression vectors. Such regulatory
sequences are described, for example, in Sambrook et al., cited above.

Depending on the choice of a host cell, a variety of expression vectors can be
used to express the polynucleotide. Such vectors include chromosomal, episomal, and
particularly virus-derived vectors, for example, AcMNPV, OpMNPV, BmNPV,
HzMNPV, and RoMNPV. Vectors may also be derived from combinations of these
sources such as those derived from plasmid and bacteriophage genetic elements, e.g.,
cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic
and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, (1989).

The regulatory sequence may provide constitutive expression in one or more
host cells or may provide for inducible expression in one or more cell types such as by
temperature, nutrient additive, or exogenous factor such as a hormone or other ligand.
A variety of vectors providing for constitutive and inducible expression in prokaryotic
and eukaryotic hosts are well known to those of ordinary skill in the art.

The polynucleotides can be inserted into the vector nucleic acid using
techniques known in the art. Generally, the DNA sequence that will ultimately be
expressed is joined to an expression vector by cleaving the DNA sequence and the
expression vector with one or more restriction enzymes and then ligating the
fragments together. Procedures for restriction enzyme digestion and ligation are well
known to those of ordinary skill in the art.

Specific expression vectors are described herein for the purposes of the
invention; for example, AcMNPV. Other expression vectors listed herein are not
intended to be limiting, and are merely provided by way of example. The person of
ordinary skill in the art would be aware of other vectors suitable for maintenance,
propagation, or expression of the polynucleotides described herein. These are found
for example in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, 1989. Any cell type or expression system
can be used for the purposes of the invention including but not limited to, for
example, baculovirus systems (O'Reilly et al. (1992) Baculovirus Expression Vectors,
W.H. Freeman and Company, New York 1992) and Drosophila-derived systems
(Johansen et al. (1989) Genes Dev. 3:882-889).

The invention also encompasses vectors in which the nucleic acid sequences
described herein are cloned into the vector in reverse orientation, but operably linked
to a regulatory sequence that permits transcription of antisense RNA. Thus, an
antisense transcript can be produced to all, or to a portion, of the polynucleotide
sequences described herein, including both coding and non-coding regions.
Expression of this antisense RNA is subject to each of the parameters described above
in relation to expression of the sense RNA (regulatory sequences, constitutive or
inducible expression, tissue-specific expression).

The recombinant host cells are prepared by introducing the vector constructs
described herein into the cells by techniques readily available to the person of
ordinary skill in the art. These include, but are not limited to, calcium phosphate
transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection, lipofection, and other techniques
such as those found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual.
2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, 1989).

Where secretion of the polypeptide is desired, appropriate secretion signals
known in the art are incorporated into the vector using techniques known in the art.
The signal sequence can be endogenous to the polypeptides or heterologous to these
polypeptides.

Where the polypeptide is not secreted into the medium, the desired protein can
be isolated from the host cell by techniques known in the art, such as, for example,
standard disruption procedures, including freeze thaw, sonication, mechanical
disruption, use of lysing agents and the like. The polypeptide can then be recovered
and purified by well-known purification methods including, but not limited to,
ammonium sulfate precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic-interaction
chromatography, affinity chromatography, hydroxylapatite chromatography, lectin
chromatography, and high performance liquid chromatography.

Furthermore, for suppressing activity of endogenous N-acetylglucosaminidase,
the invention encompasses utilizing the sequences deduced from the fragment
identified in Figure 18, and described in Example 4. More particularly, in this aspect,
the invention comprises utilization of the glucosaminidase nucleotide sequences
which are produced by using primers, such as, for example, those primer
combinations described in Example 4. These nucleotide sequences may be used in
the construction and expression of anti-sense RNA, ribozymes, or homologous
recombination (gene "knock-out") constructs, using methods readily available to those
skilled in the art, to reduce or eliminate in vivo glucosaminidase activity.

Cell lines produced by the methods of the invention can be tested by
expressing a model recombinant glycoprotein in such cell lines and assessing the
N-glycans attached therein using techniques described herein or otherwise known in the
art. The assessment can be done, for example, by 3-dimensional HPLC techniques. In
the Examples of the invention, human transferrin is used as a model target
glycoprotein, since this glycoprotein is sialylated in humans and extensive
oligosaccharide structural information for the protein is available (Montreuil et al.
(1997) Glycoproteins II Ed. 203-242). In this manner, cell lines with superior
processing characteristics are identified. Such a cell line can then be evaluated for its
growth rate, product yields, and capacity to grow in suspension culture (Lindsay et al.
(1992) Biotech. and Bioeng. 39:614-618, Reuveny et al. (1992) Ann. NY Acad. Sci.
665:320, Reuveny et al. (1993) Appl. Microbiol. Biotechnol. 38:619-623, Reuveny et
al. (1993) Biotechnol. Bioeng. 42:235-239).

The invention encompasses expressing heterologous proteins in the cells of the
invention and/or according to the methods of the invention for any purpose benefiting
from such expression. Such a purpose includes, but is not limited to, increasing the in
vivo circulatory half life of a protein; producing a desired quantity of the protein;
increasing the biological function of the protein including, but not limited to, enzyme
activity, receptor activity, binding capacity, antigenicity, therapeutic property,
capacity as a vaccine or a diagnostic tool, and the like. Such proteins may be
naturally occurring, chemically synthesized, or recombinant proteins. Examples of
proteins that benefit from the heterologous expression of the invention include, but
are not limited to, transferrin, plasminogen, Na+,K+-ATPase, thyrotropin, tissue
plasminogen activator, erythropoietin, interleukins, and interferons. Other examples
of such proteins include, but are not limited to, those described in International patent
application publication number WO 98/06835, the contents of which are herein
incorporated by reference.

In one embodiment, proteins that benefit from the heterologous expression of
the invention are mammalian proteins. In this aspect, mammals include but are not
limited to, cats, dogs, rats, mice, cows, pigs, non-human primates, and humans.

It is recognized that the heterologous expression of the invention not only
encompasses proteins that are sialylated in their native source, but also those that are
not sialylated as such, and benefit from the expression in the cells of and/or according
to the methods of the invention.

It is recognized that proteins that are not sialylated in their native source can
be altered by known genetic engineering methods so that the heterologous expression
of the protein according to the invention will result in sialylation of the protein. Such
methods include, but are not limited to, the genetic engineering methods described
herein. In this aspect, it is further recognized that altering the proteins could
encompass engineering into the protein targeting signals to ensure targeting of the
proteins to the ER and Golgi apparatus for sialylation, where such signals are needed.

It is also recognized that the cells of the invention contain proteins, which are
not sialylated prior to manipulation of the cells according to the methods of the
invention, but are sialylated subsequent to the manipulation. In this manner, the
invention also encompasses proteins that have amino acid sequences that are
endogenous to the cells of the invention, but are sialylated as a result of manipulation
of the cells according to the methods of the invention.

It is recognized that the analysis of the N-glycans produced according to the
methods of the invention may suggest additional strategies to further enhance the
sialylation of glycoproteins in insect cells. If the production of Gal containing
carbohydrate acceptor structures is low relative to those containing GlcNAc, then the
levels of Gal transferase expression are increased by integrating multiple copies of
this gene into the insect cell genome or by expressing Gal T under a stronger
promoter using techniques described herein or otherwise known in the art.
Additionally, or alternatively, substrate feeding strategies are used to enhance the
levels of UDP-Gal for this carbohydrate processing reaction. In contrast, if the
fraction of carbohydrate structures terminating in Gal is high and the fraction with
terminal SA is low, then sialyltransferase or CMP-SA production is enhanced.
Examination of sialyltransferase activity using techniques described herein or
otherwise known in the art, such as, for example, FRET or HPLC and CMP-SA levels
using HPAEC, is used to determine which step is the metabolic limiting step to
sialylation. These metabolic limitations are overcome by increasing expression of
specific enzymes or by altering substrate feeding strategies or a combination thereof.
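
The troubleshooting logic of the preceding paragraph can be sketched as a small
decision routine. The Python below is illustrative only: the function name, the
input fractions, and in particular the numeric cut-offs are assumptions introduced
here, since the text gives no thresholds for "low" or "high".

```python
def suggest_strategy(frac_gal, frac_glcnac, frac_sa,
                     low_ratio=0.5, high_gal=0.5, low_sa=0.1):
    """Suggest engineering steps from terminal-residue fractions (0..1).

    frac_gal, frac_glcnac, frac_sa: fractions of N-glycans terminating in
    Gal, GlcNAc, and sialic acid. low_ratio, high_gal, and low_sa are
    hypothetical cut-offs, not values taken from the text.
    """
    suggestions = []
    if frac_glcnac > 0 and frac_gal / frac_glcnac < low_ratio:
        # Gal-containing acceptors are scarce relative to GlcNAc-containing ones.
        suggestions.append("increase Gal T expression "
                           "(more gene copies or a stronger promoter)")
        suggestions.append("use substrate feeding to raise UDP-Gal")
    elif frac_gal >= high_gal and frac_sa < low_sa:
        # Acceptors are available but few are capped with sialic acid.
        suggestions.append("enhance sialyltransferase or CMP-SA production")
    return suggestions


if __name__ == "__main__":
    print(suggest_strategy(frac_gal=0.1, frac_glcnac=0.6, frac_sa=0.0))
    print(suggest_strategy(frac_gal=0.7, frac_glcnac=0.2, frac_sa=0.05))
```

Deciding between sialyltransferase and CMP-SA as the limiting factor still requires
the activity and metabolite measurements described above; the sketch only encodes
the first branch of that reasoning.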

ASSAYS

Having generally described the invention, the same will be more readily 
understood by reference to the following assays and examples, which are provided by 
way of illustration and are not intended as limiting. 

Analytical bioassays are implemented to evaluate enzymatic activities in the
N-glycosylation pathway of insect cells. In order to screen a larger selection of insect
cells for particular oligosaccharide processing enzymes, bioassays in which multiple
samples can be analyzed simultaneously are advantageous. Consequently, bioassays
based on fluorescence energy transfer (FRET) and time-resolved fluorometry of
europium (Eu) are designed to screen native and recombinant insect cell lines for
carbohydrate processing enzymes in a format that can handle multiple samples.

Fluorescence assays are especially useful in detecting limiting steps in
carbohydrate processing due to their sensitivity and specificity. FRET and Eu assays
detect enzymatic activities at levels as low as 10^-14 M, which is greater than the
sensitivity obtained with 125I. In addition, the use of substrates modified with
fluorophores enables the measurement of one specific enzyme activity in an insect
cell lysate, and multiple samples can be analyzed simultaneously in a microtiter plate
configuration used in an appropriate fluorometer. With these assays, insect cell lines
are rapidly screened for the presence of processing enzymes including Gal, GlcNAc,
and sialic acid transferases to identify limiting enzymes in N-glycosylation in native
and recombinant cells.

Fluorescence energy transfer (FRET) assays

Glycosyl transferase activity assays are based on the principle of fluorescence
energy transfer (FRET), which has been used to study glycopeptide conformation
(Rice et al. (1991) Biochemistry 30:6646-6655) and to develop endo-type glycosidase
assays (Lee et al. (1995) Anal. Biochem. 230:31-36).



Gal T assay 

The fluorescent compound, UDP-Gal-6-Naph, synthesized by consecutive
reactions of galactose oxidase (generating the 6-oxo compound) and reductive amination
with naphthylamine, is found to be effective as a substrate for Gal transferase. When
UDP-Gal-6-Naph is reacted with an acceptor carrying a dansyl group
(Dans-AE-GlcNAc) in the presence of Gal-T, a product is created that can transfer energy
(Figure 12). While irradiation of the naphthyl group in UDP-Gal-6-Naph at 260-290
nm ("ex" in Figure 13) results in the usual emission at 320-370 nm ("em" dotted line
in Figure 13), irradiation of the product at these same low wavelengths results in
energy transfer to the dansyl group and emission at 500-560 nm ("em" solid line in
Figure 13). Assay sensitivity is as great as the fluorometer allows (pico- to femtomole
range) and exceeds that of radioisotopes. In addition, multiple samples can be
monitored simultaneously in the fluorometer, allowing a number of cell lines to be
evaluated rapidly for Gal T activity.
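
One way to turn the emission shift described above into a single number per sample
is to compare the signal in the dansyl band (500-560 nm) with the signal in the
naphthyl band (320-370 nm). The following Python sketch assumes an emission
spectrum supplied as a wavelength-to-intensity mapping; the function names and the
ratio itself are illustrative assumptions, not a readout specified in the text, and
the ratio is an uncalibrated proxy rather than a measured enzyme activity.

```python
def band_sum(spectrum, lo_nm, hi_nm):
    """Sum the intensities of all points with lo_nm <= wavelength <= hi_nm."""
    return sum(i for wl, i in spectrum.items() if lo_nm <= wl <= hi_nm)


def fret_ratio(spectrum):
    """Share of emission in the dansyl band, from 0 (no transfer) to 1.

    spectrum: dict mapping emission wavelength (nm) to intensity, recorded
    while exciting the naphthyl group at 260-290 nm.
    """
    donor = band_sum(spectrum, 320, 370)      # naphthyl emission band
    acceptor = band_sum(spectrum, 500, 560)   # dansyl emission band
    if donor + acceptor == 0:
        raise ValueError("no signal in either emission band")
    return acceptor / (donor + acceptor)


if __name__ == "__main__":
    # Made-up readings for illustration only.
    before = {340: 100.0, 530: 2.0}
    after = {340: 40.0, 530: 60.0}
    print(fret_ratio(before), fret_ratio(after))
```

A rising ratio across a time course or across cell lysates would then indicate
product formation, subject to calibration against known amounts of product.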



Sialyltransferase assay 

A sialyltransferase assay is designed using FRET technology similar to that
described in the above example for Gal T. The 3-carbon tail (exocyclic chain) of sialic
acid (in particular, its glycoside) can be readily oxidized with mild periodate to yield
an aldehyde (Figure 14). This intermediate is reductively aminated to generate a
fluorescently tagged sialic acid (after removal of its aglycon), which is then modified
to form a fluorescently modified CMP-sialic acid (See also Lee et al. (1994) Anal.
Biochem. 216:358-364, Brossamer et al. (1994) Methods Enzymol. 247:153-177). The
acceptor substrate is modified as described above to include the dansyl group. Then
the FRET approach is used to measure either alpha(2,3) or alpha(2,6)
sialyltransferase activity since these enzymes should utilize the modified CMP-SA as
donor substrate to generate a product with altered fluorescent emission characteristics.

The choice of the fluorescent donor and acceptor pair can be flexible. The
above examples are given using naphthyl-dansyl pairs, but other fluorescent
combinations may be even more sensitive (Wu et al. (1994) Anal. Biochem.
250:260-262).

Europium (Eu+3) fluorescence assays

An example of the use of Eu+3 fluorescence for the evaluation of Gal T
activity is provided herein in the N-linked oligosaccharides from insect cells. The
same techniques are used to develop enzymatic assays for transferases such as GlcNAc
TI and glycosidases such as N-acetylglucosaminidase. Further enhancements in
sensitivity are obtained with the advent of the super-sensitive Eu-chelator, BHHT
(4,4'-bis(1",1",1",2",2",3",3"-heptafluoro-4",6"-hexanedione-6"-yl)-chlorosulfo-o-terphenyl)
(Yuan et al. (1998) Anal. Chem. 70:596-601), which allows detection
down to the lower fmol range.

GlcNAc-TI assay

A new GlcNAc-TI assay, illustrated in Figure 15, utilizes a synthetic
6-aminohexyl glycoside of the trimannosyl N-glycan core structure labeled with DTPA
(diethylenetriaminepentaacetic acid) and complexed with Eu+3. This substrate is then
incubated with insect cell lysates or positive controls containing GlcNAc TI and
UDP-GlcNAc. Chemical inhibitors are added to minimize background
N-acetylglucosaminidase activity. After the reaction, an excess of Crocus lectin CVL
(Misaki et al. (1997) J. Biol. Chem. 272:25455-25461), which specifically binds the
trimannosyl core, is added. The amount of the lectin required to bind all the
trimannosyl glycoside (and hence all the Eu+3 label) in the absence of any GlcNAc
binding is predetermined. The reacted mixture is then filtered through a 10,000
molecular weight cut off (MWCO) microfuge ultrafiltration cup. The glycoside
modified with GlcNAc does not bind CVL and appears in the filtrate. Measurement of
the Eu+3 fluorescence in the filtrate reflects the level of GlcNAc TI activity in the
culture lysates.

N-acetylglucosaminidase assay

An assay for N-acetylglucosaminidase activity is developed using a different
lectin, GS-II, which is specific for GlcNAc. The substrate is prepared by modification
of the same trimannosyl core glycoside described above using in vitro purified
GlcNAc TI, which results in addition of a GlcNAc beta(1-2) residue to the
Man alpha(1-3) residue. Following incubation with insect cell lysates, enzymatic
hydrolysis by N-acetylglucosaminidase removes GlcNAc from the substrate resulting
in the tri-mannosyl core product. The product is not susceptible to lectin binding and
thus escapes into the filtrate. Evaluation of Eu+3 fluorescence in the filtrate provides
a measure of the N-acetylglucosaminidase activity. Alternatively, enhanced binding
of the Eu-bound trimannosyl core to the Crocus lectin described above can be used as
another assay for N-acetylglucosaminidase activity.
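
Both Eu+3 lectin-filtration assays above reduce to the same arithmetic: the share
of Eu label that escapes lectin binding and appears in the filtrate is taken as the
share of substrate converted. The Python sketch below illustrates that calculation
under stated assumptions: fluorescence is proportional to the amount of label, a
blank reading is subtracted, and the function names, the clamping to 0..1, and the
activity units are introduced here for illustration only.

```python
def filtrate_conversion(filtrate_signal, total_signal, blank_signal=0.0):
    """Fraction of Eu-labeled substrate recovered in the filtrate (0..1).

    filtrate_signal: Eu fluorescence of the ultrafiltrate.
    total_signal: Eu fluorescence of the unfiltered reaction mixture.
    blank_signal: reading of a no-label blank (assumed, for background).
    """
    if total_signal <= blank_signal:
        raise ValueError("total signal must exceed the blank")
    fraction = (filtrate_signal - blank_signal) / (total_signal - blank_signal)
    # Clamp to the physically meaningful range to absorb measurement noise.
    return min(max(fraction, 0.0), 1.0)


def specific_activity(conversion, substrate_pmol, minutes, protein_mg):
    """Product formed, in pmol per minute per mg of lysate protein."""
    return conversion * substrate_pmol / (minutes * protein_mg)


if __name__ == "__main__":
    # Made-up readings for illustration only.
    conv = filtrate_conversion(2600.0, 10100.0, blank_signal=100.0)
    print(conv, specific_activity(conv, 100.0, 60.0, 0.05))
```

A control without lysate would normally be run alongside, so that label escaping
the lectin for reasons other than enzymatic conversion can be subtracted.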

Characterization of N-linked Oligosaccharides from Insect Cells

Carbohydrate structure elucidation of the N-glycans of a recombinant
glycoprotein, IgG, purified from Trichoplusia ni (High Five™ cells; Invitrogen Corp.,
Carlsbad, CA, USA) has been undertaken (Davis et al. (1993) In Vitro Cell. Dev. Biol.
29:842-846; Hsu et al. (1997) J. Biol. Chem. 272:9062-9070). The recombinant
glycoprotein, immunoglobulin G (IgG), was purified from both intracellular and
extracellular (secreted) sources and all the attached N-glycans determined using three
dimensional HPLC techniques. The composition of these structures provided insights
into the carbohydrate processing pathways present in insect cells and allowed a
comparison of intracellular and secreted N-glycan structures.

The Trichoplusia ni cells grown in serum free medium in suspension culture
were infected with a baculovirus vector encoding a murine IgG (Summers et al.
(1987) A manual of methods for baculovirus vectors and insect cells culture
procedures). IgG includes an N-linked oligosaccharide attachment on each of the two
heavy chains.

Heterologous IgG was purified from the culture supernatant and soluble cell
lysates using a Protein A-Sepharose column. N-linked oligosaccharides were isolated
following protease digestion of IgG and treatment with glycoamidase A to release the






N-glycans. Oligosaccharides were then derivatized with 2-aminopyridine (PA) at the 
reducing ends to provide fluorogenic properties for detection. 

Three-dimensional HPLC analysis was performed to elucidate the N-linked
oligosaccharide structures attached to the heavy chain of IgG (Tomiya et al. (1988)
Anal. Biochem. 171:73-90, Takahashi et al. (1992) Handbook of Endoglycosidases
and Glycoamidases Ed. 199-332). This technique separates oligosaccharides by three
successive HPLC steps and enables the identification of structures by comparison of
elution conditions with those of known standards.

A DEAE column was used to separate oligosaccharides on the basis of
carbohydrate acidity (first dimension). None of the oligosaccharides retained on this
column were found to include sialic acid. Treatment of the acidic fractions with
neuraminidase from Arthrobacter ureafaciens (known to cleave all known sialic acid
linkages) failed to release any sialic acid, and ODS-chromatography of the fractions
revealed several minor components different from all known sialylated
oligosaccharides.

The second dimension used reverse phase HPLC with an ODS-silica column
to fractionate the labeled oligosaccharides according to carbohydrate structure.
Oligosaccharides from supernatant (S) and lysate (L) IgG were separated into 6 and 10
fractions, respectively, labeled A-L in Figure 6.

Separation in the third and final dimension was accomplished using an amide
column to isolate oligosaccharides on the basis of molecular size. Peak B from the
ODS column was separated into two separate oligosaccharide fractions, and peak H
was separated into three separate oligosaccharide fractions on the amide-column.

After oligosaccharide purification, structures of unknown oligosaccharides
were determined by comparing their positions on the 3-dimensional map with the
positions of over 450 known oligosaccharides. Co-elution of an unknown sample
with a known PA-oligosaccharide on the ODS and amide-silica columns was used to
confirm the identity of an oligosaccharide. Digestion by glycosidases with specific
cleavage sites (alpha-L-fucosidase, beta-galactosidase,
beta-N-acetylglucosaminidase, and alpha-mannosidase) followed by reseparation provided
further confirmation.

All the oligosaccharides in the culture medium and cell lysates matched
known carbohydrates except for oligosaccharide G. The structure of oligosaccharide
G was elucidated by treatment of the N-glycan with alpha-L-fucosidase, known to
digest Fuc alpha1-6GlcNAc, followed by treatment with 13.5 M trifluoroacetic acid
to remove the alpha1,3 linked fucose. The de-alpha1,6- and de-alpha1,3-fucosylated
oligosaccharide G co-eluted with a known oligosaccharide, allowing the identification
of G. The structure of oligosaccharide G is shown in Figure 7.

The structure of oligosaccharide G was further confirmed by 1H-NMR and
electrospray ionization (ESI) mass spectrometry (Hsu et al. (1997) J. Biol. Chem.
272:9062-9070). Thus, the combination of these techniques can be used to elucidate
both known and unknown oligosaccharides.
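
The map-matching step described above, in which an unknown is identified by
comparing its position with those of known PA-oligosaccharides on the ODS and amide
columns, can be sketched as a nearest-neighbor lookup. In the Python below, the
standard names, their coordinates, and the tolerance are hypothetical placeholders;
they are not values from the reference map of over 450 oligosaccharides.

```python
import math

# Hypothetical reference coordinates: (ODS elution, amide elution) in
# arbitrary units. A real map would hold several hundred standards.
STANDARDS = {
    "std_A": (7.2, 4.3),
    "std_B": (10.1, 5.0),
    "std_C": (12.6, 6.8),
}


def match_on_map(ods, amide, standards=STANDARDS, tolerance=0.3):
    """Return (name, distance) of the nearest standard, or None.

    None means no standard lies within `tolerance` of the unknown, i.e. the
    structure is not on the map and needs further analysis.
    """
    best_name, best_dist = None, math.inf
    for name, (s_ods, s_amide) in standards.items():
        dist = math.hypot(ods - s_ods, amide - s_amide)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist <= tolerance:
        return best_name, best_dist
    return None


if __name__ == "__main__":
    print(match_on_map(10.0, 5.1))   # close to std_B
    print(match_on_map(20.0, 20.0))  # not on the map
```

An unknown that returns no match corresponds to a case such as oligosaccharide G,
which had to be trimmed with glycosidases and acid before it co-eluted with a known
structure.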

The carbohydrates attached to IgG from the culture medium and intracellular
lysate were identified and the levels present in each source were quantified. These
structures were then used in conjunction with previous studies of oligosaccharide
processing in insect cells (Altmann et al. (1996) Trends in Glycoscience and
Glycotechnology 8:101-114) to generate a detailed map of N-linked oligosaccharide
processing in Trichoplusia ni insect cells. The pathway and the levels of the
oligosaccharides from secreted and intracellular sources are detailed in Figure 8.

The initial processing in the T. ni cells appears to be similar to the mammalian
pathway, including trimming of the terminal glucose and mannose residues. The
trimming process follows a linear pathway with the exception of two different forms
of Man7GlcNAc2 (M7GN2 in Figure 8), also observed in native insect glycoproteins
(Altmann et al. (1996) Trends in Glycoscience and Glycotechnology 8:101-114) and
IgG4 from NS/0 cells (Ip et al. (1994) Arch. Biochem. Biophys. 308:387-399). The
presence of these two Man7 forms suggests the possible existence of an alternative
processing pathway that yields Man7GlcNAc2 through the action of
endo-alpha-mannosidase. Following cleavage of the mannose residues, GlcNAc (GN) is
added to the alpha1,3 branch of Man5GlcNAc2 by GlcNAc TI
(N-acetylglucosaminyltransferase I) (Altmann et al. (1996) Trends in Glycoscience and
Glycotechnology 8:101-114). However, GlcNAc1Man5GlcNAc2 must be a short-lived
intermediate quickly processed by alpha-Man II, since this structure was not
detected in the T. ni cell lysate. At the GlcNAc1Man3GlcNAc2 oligosaccharide,
several branching steps in the N-glycan processing pathway are possible in insect
cells. Complex glycoforms can be generated by the action of GlcNAc TII
(N-acetylglucosaminyltransferase II) and Gal T (galactosyltransferase) to provide
oligosaccharides which include terminal GlcNAc (GN) and Gal (G) residues. None
of the complex oligosaccharide structures included sialic acid, indicating that
sialylation is negligible or non-existent in these cells.

The production of these complex glycoforms must compete with an alternative
processing pathway that is catalyzed by N-acetylglucosaminidase (Altmann et al.
(1995) J. Biol. Chem. 270:17344-17349) (see Branch Points in Figure 8), leading to
the production of hybrid and paucimannosidic structures. While the complex-type
N-glycans represent 35% of the total secreted glycoforms (supernatant % in Figure 8),
the majority of secreted N-glycans are either paucimannosidic (35%) or hybrid
structures (30%). Furthermore, those complex structures with a branch terminating in
Gal represent less than 20% of the total secreted glycoforms and no structures were
observed with terminal Gal on both branches of the N-glycan.
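
As a quick arithmetic check on the secreted distribution quoted above, the three
classes sum to 100% and the non-complex share is 35% + 30% = 65%. The short Python
sketch below reproduces that tally; the dictionary layout and function name are
illustrative assumptions, and the percentages are the ones given in the text.

```python
# Secreted glycoform classes (percent of total), as quoted in the text.
SECRETED = {"complex": 35, "paucimannosidic": 35, "hybrid": 30}


def non_complex_share(distribution):
    """Percent of glycoforms that are not of the complex type."""
    total = sum(distribution.values())
    if total == 0:
        raise ValueError("empty distribution")
    return 100.0 * (total - distribution.get("complex", 0)) / total


if __name__ == "__main__":
    print(sum(SECRETED.values()), non_complex_share(SECRETED))
```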

In contrast to the secreted glycoforms, the intracellular N-glycans (lysate % in
Figure 8) obtained from insect cells include more than 50% high-mannose type
structures. The fraction of intracellular complex oligosaccharides is less than 15% and
only 8% include a terminal Gal residue. The high level of high-mannose structures
from intracellular sources indicates significantly less oligosaccharide processing for
most of the intracellular immunoglobulins. Many of these intracellular
immunoglobulins may not reach the compartments in which carbohydrate trimming
takes place (Jarvis et al. (1989) Mol. Cell. Biol. 9:214-223). High mannose
glycoforms are also observed intracellularly for mammalian cells (Jenkins et al.
(1998) Cell Culture Engineering VI).

Examples 

Example 1: Evaluation of N-glycosylation Pathway Enzymes

The levels of N-linked oligosaccharide processing enzymes are measured
using analytical assays to characterize carbohydrate processing in native and
recombinant insect cells. These assays are used to compare the N-glycan processing
capacity of different cell lines and to evaluate changes in processing and metabolite
levels following metabolic engineering modifications.

High Performance Anion Exchange Chromatography (HPAEC) assay for galactose
transferase

HPAEC is used in combination with pulsed amperometric detection
(HPAEC-PAD) or conductivity to detect metabolite levels in the CMP-SA pathway and to
evaluate N-linked oligosaccharide processing enzymes essentially as described by
Lee et al. (1990) Anal. Biochem. 34:953-957 and Lee et al. (1996) J. Chromatography A
720:137-149. Shown in Figure 9 is an example of the use of HPAEC-PAD for
measuring Gal T activity by following the lactose formation reaction:

UDP-Gal + Glc --(Gal T)--> Lac + UDP

The peak labeled "Lac" indicates the formation of the product lactose (Lac).
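
To illustrate how a "Lac" peak could be converted into an activity figure, the
Python sketch below applies a single-point external calibration. It assumes that
the detector response is linear through zero over the range used, which would need
to be verified for a given instrument; the function names, the calibration scheme,
and the units are assumptions made here and are not taken from the cited methods.

```python
def lactose_nmol(peak_area, std_area, std_nmol):
    """Lactose amount from its peak area, using one lactose standard.

    Assumes the response is linear through zero, so
    amount = peak_area * (std_nmol / std_area).
    """
    if std_area <= 0:
        raise ValueError("standard peak area must be positive")
    return peak_area * std_nmol / std_area


def gal_t_activity(peak_area, std_area, std_nmol, minutes, protein_mg):
    """Gal T activity in nmol lactose formed per minute per mg protein."""
    return lactose_nmol(peak_area, std_area, std_nmol) / (minutes * protein_mg)


if __name__ == "__main__":
    # Made-up peak areas for illustration only.
    print(lactose_nmol(500.0, 1000.0, 10.0))
    print(gal_t_activity(500.0, 1000.0, 10.0, minutes=30.0, protein_mg=0.1))
```

A multi-point standard curve would be the more cautious choice wherever the
response departs from linearity.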

Many of the enzymes involved in N-glycosylation (e.g., aldolase, CMP-NeuAc
synthetase, sialyltransferase) and metabolic intermediates (e.g., sialic acid,
CMP-sialic acid, ManNAc, ManNAc-6-phosphate) in the CMP-SA production pathway are
measured using this form of chromatography, essentially as described by Lee et al.
(1990) Anal. Biochem. 34:953-957, Lee et al. (1996) J. Chromatography A
720:137-149, Hardy et al. (1988) Anal. Biochem. 170:54-62, Townsend et al. (1988) Anal.
Biochem. 174:459-470, Kiang et al. (1997) Anal. Biochem. 245:97-101.

Reverse phase High Performance Liquid Chromatography (HPLC) for
sialyltransferase

To detect native sialyltransferase enzyme activity, Trichoplusia ni lysates were
incubated in the presence of exogenously added CMP-SA and the fluorescent
substrate, 4-methylumbelliferyl lactoside (Lac-MU). Negligible conversion of the
substrate was observed, indicating the absence of endogenous sialyltransferase
activity. However, following infection of the insect cells with a baculovirus encoding
human alpha2-3-sialyltransferase, conversion of Lac-MU to the product sialyl
LacMU was observed in cell lysates using Reverse Phase HPLC and a fluorescence
detector (Figure 10). For higher sensitivity, Lac-PA (PA=2-aminopyridine) or
Lac-ABA (ABA=o-aminobenzamide) are used as substrates. HPLC and HPAEC are used
in conjunction with other fluorometric methods detailed in the procedures to analyze
the metabolites and enzymatic activities in insect cells.

Dissociation Enhanced Lanthanide FluoroImmunoAssay (DELFIA) for GalT

The chromatographic techniques described above share one limitation: only one
sample can be handled at a time. When samples from several cell lines must be
assayed, a method such as DELFIA is advantageous since a multiwell fluorometer can
simultaneously examine activities in many samples on a microtiter plate (Hemmila et
al. (1984) Anal. Biochem. 137:335-343). The application of such a technique for the
measurement of Gal T activity in several different insect cell lysates and controls is
shown in Figure 11. First, the wells of the microtiter plate are coated with the
substrate GlcNAc-BSA (Stowell et al. (1993) Meth. in Carb. Chem. 9:178-181).
After incubation with Gal T and UDP-Gal, the well is washed and the Gal residue
newly attached to GlcNAc-BSA is measured with europium (Eu3+)-labeled Ricinus
communis lectin, which specifically binds Gal or GalNAc structures. The sensitivity
of Eu fluorescence under appropriate conditions can reach the fmol range and match
or exceed that of radioiodides (Kawasaki et al. (1997) Anal. Biochem. 250:260-262).

Figure 11 depicts GlcNAc-BSA in (A) boiled lysate; (B) T. ni lysate; (C) standard
enzyme, 0.5 mU; (D) T. ni insect cells infected with a baculovirus coding for GalT;
(E) Sf-9 cells stably transfected with the GalT gene. The increase in Gal T activity in
untreated cell lysates (B in Figure 11) relative to boiled lysates (A) indicates that T. ni
cells have low but measurable endogenous Gal T activity. The Gal T activity level is
increased significantly following infection with a baculovirus vector including a
mammalian Gal T gene under the IE1 promoter or by using Sf-9 cells stably
transformed with the Gal T gene (cell lines are described in Jarvis et al. (1996) Nature
Biotech. 14:1288-1292; and Hollister et al. (1998) Glycobiology 8:473-480).

The DELFIA method is not limited to Gal T measurement. This technique is
used to evaluate the activity of any processing enzyme which generates carbohydrate
structures containing binding sites for a specific lectin or carbohydrate-specific
antibodies (Taki et al. (1994) Anal. Biochem. 219:104-108, Rabina et al. (1997) Anal.
Biochem. 246:459-470).

Example 2: Enhancing SA levels by Substrate Addition 
Because the conventional substrates in insect cell media are not efficiently
converted to CMP-SA in insect cells, as demonstrated by the low levels of CMP-SA,
alternative substrates are added to the culture medium. Because sialic acid and
CMP-SA do not permeate cell membranes (Bennetts et al. (1981) J. Cell. Biol. 88:1-15),
they are not considered appropriate substrates. However, other precursors in
the CMP-SA pathway are incorporated into cells and are considered as substrates for
the generation of CMP-SA in insect cells.

Incorporation and conversion of N-acetylmannosamine (ManNAc)

ManNAc has been added to mammalian tissue and cell cultures and
enzymatically converted to SA and CMP-SA (Ferwerda et al. (1983) Biochem. J.
216:87-92, Gu et al. (1997) Improvement of the interferon-gamma sialylation in
Chinese hamster ovary cell culture by feeding N-acetylmannosamine, Thomas et al.
(1985) Biochim. Biophys. Acta 846:37-43). Consequently, external feeding of
ManNAc is examined as one strategy to enhance CMP-SA levels in insect cells.

ManNAc is available commercially (Sigma Chemical Co.) or can be prepared
chemically from the less expensive feedstock GlcNAc in vitro using sodium
hydroxide (Mahmoudian et al. (1997) Enzyme and Microbial Technology 20:393-400).
Initially, the level of native cellular ManNAc, if any, is determined using
HPAEC-PAD techniques (Lee et al. (1990) Anal. Biochem. 34:953-957, Lee et al.
(1996) J. Chromatography A 720:137-149, Hardy et al. (1988) Anal. Biochem.
170:54-62, Townsend et al. (1988) Anal. Biochem. 174:459-470, Kiang et al. (1997)
Anal. Biochem. 245:97-101). The ability to increase intracellular ManNAc levels is
evaluated by adding ManNAc to cell culture media. Incorporation of exogenous
ManNAc is quantified using unlabeled ManNAc if levels of native ManNAc are
negligible, or 14C- or 3H-labeled ManNAc if significant levels of native ManNAc are
present (Bennetts et al. (1981) J. Cell Biol. 88:1-15, Kriesel et al. (1988) J. Biol.
Chem. 263:11736-11742). The levels of radioactive ManNAc and other metabolites
are determined by collecting ManNAc peaks following HPAEC and measuring the
radioactivity using scintillation counting.

To be effective as a substrate for sialylation, the ManNAc must be converted
to SA and CMP-SA through intracellular pathways. This conversion is detected
directly from externally added ManNAc by following an increase in internal SA and
CMP-SA levels using HPAEC or thin layer chromatography (TLC) combined with
liquid scintillation counting to detect the radiolabeled metabolites. HPAEC
techniques have been used to quantify cellular pools of CMP-SA in as few as 6 x 10^6
mammalian cells (Fritsch et al. (1996) Journal of Chromatography A 727:223-230),
and TLC has been used to evaluate conversion of 14C-labeled ManNAc to sialic acid
in bacteria (Vann et al. (1997) Glycobiology 7:697-701). If the addition of ManNAc
leads to a significant increase in the CMP-SA levels, a limiting step exists in the
production of ManNAc from conventional insect cell media substrates. Different
ManNAc feeding concentrations are tested and the effect on CMP-SA levels and
insect cell viability is evaluated to determine whether there are any deleterious
effects from feeding ManNAc as substrate. Conversion of ManNAc to SA through the
aldolase pathway requires pyruvate, and the addition of cytidine can enhance
CMP-SA production from SA (Thomas et al. (1985) Biochim. Biophys. Acta 846:37-43).
Thus, pyruvate and cytidine are optionally added to the medium to enhance
conversion of ManNAc to CMP-SA (Tomita et al. (1995) Biochim. Biophys. Acta
1243:329-335, Thomas et al. (1985) Biochim. Biophys. Acta 846:37-43).

Alternative Substrates 

Other precursor substrates such as N-acetylglucosamine (GlcNAc) and
glucosamine are converted to SA and CMP-SA through the ManNAc pathway in
eukaryotic cells (Pederson et al. (1992) Cancer Res. 52:3782-3786, Kohn et al. (1962)
J. Biol. Chem. 237:304-308). The disposition of these alternative precursor substrates
is monitored by HPAEC analysis using techniques known in the art and compared
with ManNAc feeding strategies to determine which substrate provides for the most
efficient production of CMP-SA in particular insect cells.

Example 3: Purification and cloning of CMP-SA synthetase

A bioinformatics search of the cDNA libraries of HGS revealed a novel
human CMP-SA synthetase gene based on its homology with the E. coli DNA
sequence. The bacterial enzyme includes a nucleotide binding site for CTP. This
binding site contains a number of amino acids that are conserved among all known
bacterial CMP-SA synthetase enzymes (see Stoughton et al., Biochem J. 15:397-402
(1999)). The identity of the human cDNA as a CMP-SA synthetase gene was
confirmed by the presence of significant homology within this binding motif:

bacterial sequence:   IIAIIPARSGSKGL
identity/homology:    + A+I AR GSKG+
human cDNA:           LAALILARGGSKGI

This human homologue, commercially, publicly, or otherwise available for the
purposes of this invention, is cloned and expressed in insect cells. The nucleotide and
amino acid sequences of human CMP-SA synthetase are shown in Figures 29 and 30,
respectively.
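
The identity/homology line of the binding-motif comparison can be reproduced with a short script. The following is a minimal sketch only: the two motifs are copied from the text, while the function name and the similarity groups are hypothetical choices made for illustration and are not the scoring scheme used in the original search.

```python
# Hypothetical similarity groups (an assumption, not taken from the source):
# residues that share a group are scored as "similar" and printed as "+".
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def consensus_line(seq_a, seq_b):
    """Return the identity/similarity line for two ungapped, equal-length motifs."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    marks = []
    for a, b in zip(seq_a, seq_b):
        if a == b:
            marks.append(a)        # identical residue: echo the letter
        elif any(a in g and b in g for g in SIMILAR_GROUPS):
            marks.append("+")      # conservative substitution
        else:
            marks.append(" ")      # no match
    return "".join(marks)


BACTERIAL = "IIAIIPARSGSKGL"   # bacterial motif as printed in the text
HUMAN = "LAALILARGGSKGI"       # human cDNA motif as printed in the text

line = consensus_line(BACTERIAL, HUMAN)
identical = sum(ch.isalpha() for ch in line)   # letters mark identical positions
similar = line.count("+")                      # "+" marks similar positions

if __name__ == "__main__":
    print(repr(line))
    print(f"identical: {identical}/{len(line)}, similar: {similar}/{len(line)}")
```

Under these assumed groups the script counts 8 identical and 3 similar positions across the 14-residue motif, which matches the pattern of letters and plus signs shown in the comparison.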

Example 4: Isolation and Inhibition of glucosaminidase 
It is recognized that insect cells could possess additional
N-acetylglucosaminidase enzymes other than the enzyme responsible for generating
low-mannose structures, so both recombinant DNA and biochemical approaches are
implemented to isolate the target N-acetylglucosaminidase gene. PCR techniques are
used to isolate fragments of N-acetylglucosaminidase genes by the same strategies
used in isolating alpha-mannosidase cDNAs from Sf-9 cells (Jarvis et al. (1997)
Glycobiology 7:113-127, Kawar et al. (1997) Glycobiology 7:433-443). Degenerate
oligonucleotide primers are designed corresponding to regions of conserved amino
acid sequence identified in all N-acetylglucosaminidases described thus far, from
human to bacteria, including two lepidopteran insect enzymes (Zen et al. (1996)
Insect Biochem. Mol. Biol. 26:435-444). These primers are used to amplify a fragment
of the N-acetylglucosaminidase gene(s) from genomic DNA or cDNA of lepidopteran
insect cell lines commercially, publicly, or otherwise available for the purposes of this
invention. A putative N-acetylglucosaminidase gene fragment from Sf9 genomic
DNA and from High Five™ cell (Invitrogen Corp., Carlsbad, CA, USA) cDNA has
been identified (Figure 18). Similar techniques are used to isolate cDNAs from other
insect cell lines of interest. The identification of cDNAs for the Sf9 or High Five™
N-acetylglucosaminidase facilitates the isolation of the gene in other insect cell lines.
Figure 18 depicts PCR amplification of Sf9 genomic DNA (A) or High
Five™ cell cDNA (B) with degenerate primers corresponding to three different regions
conserved within N-acetylglucosaminidases. These regions were designated 1, 2, and
3, from 5' to 3'. Lane 1 (sense primer 1 and antisense primer 2); Lane 2 (sense primer
1 and antisense primer 3); Lane 3 (sense primer 2 and antisense primer 3). M (size
standards with sizes indicated in Kbp). The results show that specific fragments of
N-acetylglucosaminidase genes were amplified from both DNAs (lanes A2 and B3).
The specificity of the reactions is indicated by the fact that different primer pairs
produced different amplification products from different templates. The primer
sequences utilized in amplifying the putative N-acetylglucosaminidase gene were as
follows:

Sense primer #1: 5'-T/C,T,I,C,A,C/T,T,G,G,C,A,C/T,A/T/C,T,I,G,T,I,GA-3' (SEQ
ID NO:9)

Sense primer #2: 5'-GA,G/A,AT,T,A/C/T,G,A,C/T,I,I,I,C,C,I,G,G/C,I,C,A-3' (SEQ
ID NO:10)

Antisense primer #2: 5'-T^J^/G^J^^JJ^LG/A^^J/G/AA^/A^/T^^^'
(SEQ ID NO:11)

Antisense primer #3: 5'-A,C/A/G,C/T,T,C,G/A,T,C J,C,C,I,C,C,I,I,I,G/A,T,G-3'
(SEQ ID NO:12)

The PCR amplified fragments are cloned and sequenced using the chain
termination method (Sanger et al. (1977) Proc. Natl. Acad. Sci. USA 74:5463-5467).
The results are used to design exact-match oligonucleotide primers to isolate an
N-acetylglucosaminidase clone(s) from existing Sf9 and/or High Five™ lambda ZAPII
cDNA libraries by sibling selection and PCR (Jarvis et al. (1997) Glycobiology
7:113-127, Kawar et al. (1997) Glycobiology 7:433-443). The library is consecutively
split into sub-pools that score positive in PCR screens until a positive sub-pool of
approximately 2,000 clones is obtained. These clones are then screened by plaque
hybridization (Benton et al. (1977) Science 196:180-182) using the cloned PCR
fragment, and positive clones are identified and plaque purified. The cDNA(s) are
then excised in vivo as a pBluescript-based subclone in E. coli.
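
The number of splitting rounds needed in sibling selection depends on the starting library size and on how many sub-pools are made per round. The sketch below illustrates that arithmetic only. The function name, the library size of 1,000,000 clones and the choice of 10 sub-pools per round are hypothetical values chosen for illustration; only the target of approximately 2,000 clones comes from the text.

```python
def sibling_selection_rounds(library_size, pools_per_round, target_pool=2000):
    """Count splitting rounds until the positive sub-pool holds at most target_pool clones.

    Assumes an even split each round and that exactly one sub-pool is carried forward.
    """
    if pools_per_round < 2:
        raise ValueError("each round must split the pool into at least 2 sub-pools")
    rounds = 0
    pool = library_size
    while pool > target_pool:
        # Ceiling division: size of the largest sub-pool after an even split.
        pool = -(-pool // pools_per_round)
        rounds += 1
    return rounds, pool


if __name__ == "__main__":
    # Hypothetical library of 1,000,000 clones, split 10 ways per round.
    rounds, final_pool = sibling_selection_rounds(1_000_000, 10)
    pcr_screens = rounds * 10  # one PCR screen per sub-pool per round
    print(rounds, final_pool, pcr_screens)
```

With these assumed inputs, three rounds of splitting (30 PCR screens) reduce the library to a sub-pool of 1,000 clones, small enough for plaque hybridization.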

Isolation of N-acetylglucosaminidases using biochemical techniques

Since insect cells may have multiple N-acetylglucosaminidases, a biochemical
purification approach is also used to broaden the search for the cDNA encoding the
target enzyme. A polyclonal antiserum against a Manduca sexta
N-acetylglucosaminidase (Koga et al. (1983) Manduca sexta Comparative Biochemistry
and Physiology 74:515-520) is used to examine Sf9 and High Five™ cells for
cross-reactivity. This antiserum is used to probe for the N-acetylglucosaminidase during
biochemical isolation techniques. In addition, specific assays for
N-acetylglucosaminidase described earlier are used to monitor enzyme activity in
isolated biochemical fractions.

The target N-acetylglucosaminidase is membrane bound, so it must be
solubilized using a detergent such as Triton X-100 prior to purification. Once
solubilized, the enzyme is purified by a combination of gel filtration, ion exchange,
and affinity chromatography. For affinity chromatography, the affinants
6-aminohexyl thio-N-acetylglucosaminide (Chipowsky et al. (1973) Carbohydr. Res.
31:339-346) or BSA modified with thio-N-acetylglucosaminide (Lee et al. (1976)
Biochemistry 15:3956-3963) are tried first. If necessary, 6-aminohexyl
alpha-D-[2-(thio-2-amino-2-deoxy-beta-D-glucosaminyl)]-mannopyranoside or other
thio-oligosaccharides are synthesized and used as affinants. Affinity matrices are
prepared using commercially available products.

Alternatively, the target enzyme may be "anchored" to the membrane by a
glycophosphoinositide. In such a case, a specific phospholipase C is used to release
the active enzyme from the membrane, and the use of detergent for solubilization is
avoided.

The purity of the enzyme is examined with SDS-PAGE and mass
spectroscopy, and the activity of the enzyme is characterized. Once the enzyme is
sufficiently purified, its amino-terminal region is sequenced by conventional Edman
degradation techniques, available commercially. If N-terminal blockage is
encountered, the purified protein is digested, the peptides are purified, and these
peptides are used to obtain internal amino acid sequences. The resulting sequence
information is used to design degenerate oligonucleotide primers that are used, in
turn, to isolate cDNAs as described above.

Expression of glucosaminidase

Isolated full-length cDNAs are sequenced, compared to other
N-acetylglucosaminidase cDNAs, and expressed using known polyhedrin-based
baculovirus vectors. The overexpressed proteins are purified, their biochemical
activities and substrate specificities are characterized, and new polyclonal antisera
are produced to establish the subcellular locations of the enzymes in insect cells. The
locations are optionally identified by using the antisera in conjunction with secretory
pathway markers, including Golgi and endoplasmic reticulum specific dyes and
GFP-tagged N-glycan processing enzymes commercially, publicly, or otherwise available
for the purposes of this invention. Evaluation of the N-glycan structures on secreted
glycoproteins from insect cells overexpressing the glucosaminidase gene
demonstrates the involvement of this enzyme in N-glycan processing as opposed to
lysosomal degradation, a common activity for other glucosaminidases.

Example 5: Expression of the model glycoprotein transferrin 
The gene encoding human transferrin as described in Genbank accession No.
S95936 is cloned into the baculovirus vector, expressed in multiple insect cell lines,
and purified to homogeneity. Figure 26 shows SDS-PAGE of transferrin from insect
cells (M=unpurified lysates, P=purified protein). Similar techniques are used to
express and purify this glycoprotein in the target cell line(s) of interest following
manipulation of the glycosyltransferase and CMP-SA production pathways.

Once the transferrin is purified to homogeneity, the structures of the
oligosaccharides which are N-linked at two sites of the transferrin are analyzed using
3-dimensional HPLC mapping techniques. Over 450 N-glycans have been mapped
with this technique. For example, characterization of the N-linked oligosaccharides
attached to the heavy chain of secreted and intracellular IgG is described.
Confirmation of particular carbohydrate structures is provided by treating the
oligosaccharides with glycosidases and re-eluting through the HPLC columns.
Additional structural information on unknown oligosaccharides is obtained using
mass spectrometry and NMR techniques previously used for analysis of IgG
glycoforms (Hsu et al. (1997) J. Biol. Chem. 272:9062-9070).

These analytical techniques allow the identification and quantification of
N-glycans to determine whether a fraction of these structures is sialylated.
Sialylation is confirmed by treating the purified N-glycan with sialidase from A.
ureafaciens and measuring the release of sialic acid using HPAEC-PAD.

The present invention now will be described more fully hereinafter with
reference to the accompanying drawings, in which preferred embodiments of the
invention are shown. This invention may, however, be embodied in many different
forms and should not be construed as limited to the embodiments set forth herein;
rather, these embodiments are provided so that this disclosure will be thorough and
complete, and will fully convey the scope of the invention to those skilled in the art.
Like numbers refer to like elements throughout.

Many modifications and other embodiments of the invention will come to
mind to one skilled in the art to which this invention pertains having the benefit of the
teachings presented in the foregoing descriptions and the associated drawings.
Therefore, it is to be understood that the invention is not to be limited to the specific
embodiments disclosed and that modifications and other embodiments are intended to
be included within the scope of the appended claims. Although specific terms are
employed herein, they are used in a generic and descriptive sense only and not for
purposes of limitation.



Example 6: Cloning, expression, and characterization of the human sialic acid
synthetase (SAS) gene and gene product.

This example reports the cloning and characterization of a novel human gene
having homology to the Escherichia coli sialic acid synthetase gene (neuB). This
human gene is ubiquitously expressed and encodes a 40 kD enzyme which results in
N-acetylneuraminic acid (Neu5Ac) and 2-keto-3-deoxy-D-glycero-D-galacto-nononic
acid (KDN) production in insect cells upon recombinant baculovirus infection. In
vitro, the human enzyme uses N-acetylmannosamine-6-phosphate and
mannose-6-phosphate as substrates to generate phosphorylated forms of Neu5Ac and KDN,
respectively, but exhibits much higher activity toward the Neu5Ac phosphate product.

In order to identify genes involved in sialic acid biosynthesis in eukaryotes,
homology searches of a human expressed sequence tag (EST) database were
performed using the E. coli sialic acid synthetase gene. A cDNA of approximately 1
kb with a predicted open reading frame (ORF) of 359 amino acids was identified.
Northern blot analysis indicated that the mRNA is ubiquitously expressed, and in
vitro transcription and translation along with recombinant expression in insect cells
demonstrated that the human sialic acid synthetase (SAS) gene encodes a 40 kD
protein. SAS rescued an E. coli neuB mutant, although less efficiently than neuB.
Neu5Ac production in insect culture supplemented with ManNAc further supported
the role of SAS in sialic acid biosynthesis. In addition to Neu5Ac, a second sialic
acid, KDN, was generated, suggesting that the human enzyme has broad substrate
specificity. The human enzyme (SAS), unlike its E. coli homologue, uses
phosphorylated substrates to generate phosphorylated sialic acids and thus likely
represents the previously described sialic acid-9-phosphate synthetase of mammalian
cells (Watson et al., J. Biol. Chem. 241, 5627-5636 (1966)).

Identification of a Human Sialic Acid Synthetase Gene 

The E. coli sialic acid synthetase gene (Annunziato et al., J. Bacteriol. 177,
312-319 (1995)) was used to search the human EST database of Human Genome
Sciences, Inc. (Rockville, MD). One EST with significant homology to the neuB
gene was found in a human liver cDNA library and used to identify a full length
cDNA (Figure 35A) with an ORF homologous to the bacterial synthetase over most
of its length. The putative synthetase consisted of 359 amino acids (SEQ ID NO:6)
while the neuB gene product contained 346 amino acids (SEQ ID NO:8). Alignment
of the human against the bacterial enzyme demonstrated that significant differences
were found primarily in the N-terminus (Figure 35B). Overall, the two synthetases
were found to be 36.1% identical and 56.1% similar at the amino acid level.

The product of a cDNA amplification with a T7 promoter was expressed by in
vitro transcription and translation using rabbit reticulocyte lysates. The generation of
a ~40 kD protein, consistent with a predicted molecular weight of 40.3 kD,
confirmed the existence of an ORF (Figure 36A, lane 2). The negative control,
namely the vector without an insert, did not produce a protein product (Figure 36A,
lane 1). Northern blot analysis was performed on poly-A+ RNA blots representing a
selection of human tissues (Figure 36B). The full-length cDNA was radio-labeled
and used as probe. A band of the expected size, ~1.3 kb, was observed in all tissues
tested, suggesting that the putative synthetase is ubiquitously expressed.
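
As a rough plausibility check on the ~40 kD band, the mass of a protein can be approximated from its length. The sketch below uses the common rule of thumb of about 110 Da per amino acid residue, which is an assumption introduced here for illustration. The predicted value of 40.3 kD in the text was presumably computed from the actual sequence, which this estimate does not attempt to reproduce.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kd(n_residues):
    """Approximate protein mass in kD from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for name, length in (("human SAS", 359), ("E. coli neuB product", 346)):
        print(f"{name}: {length} aa, roughly {approx_mass_kd(length):.1f} kD")
```

For the 359-residue ORF this gives roughly 39.5 kD, within about 1 kD of the predicted molecular weight and consistent with the band observed after in vitro translation.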

Expression and Purification of Human Sialic Acid Synthetase 

SAS was inserted into baculovirus under the polh promoter using lacZ as a
positive selection marker. After transfection and viral titering, the resulting virus
(AcSAS) was used to infect Spodoptera frugiperda (Sf-9) cells followed by pulse
labeling. A ~40 kD band was observed in the Sf-9 lysates from cells infected by
AcSAS (Figure 36A, lane 5) and not in the mock infected control (Figure 36A, lane
4). Furthermore, this band co-migrated with the protein produced in vitro. To verify
SAS expression, the band was visualized in the non-nuclear fraction (Miyamoto et al.,
Mol. Cell. Biol. 5, 2860-2865 (1985)) after electrophoretic transfer to a ProBlott™
membrane and Ponceau S staining (data not shown) and excised for amino acid
sequencing. The five N-terminal amino acids were identical to the second through
sixth amino acids of the predicted protein (data not shown). Interestingly, the initiator
methionine was also removed from the purified recombinant E. coli sialic acid
synthetase (Vann et al., 1997).

In Vivo Activity of Human Sialic Acid Synthetase 

Covalent labeling of sialic acids with the fluorescent reagent
1,2-diamino-4,5-methylenedioxybenzene dihydrochloride (DMB) allows very specific and
sensitive sialic acid detection (Hara et al., Anal. Biochem. 179, 162-166 (1989);
Manzi et al., Anal. Biochem. 188, 20-32 (1990)). The DMB reaction products are
identified after separation by reverse phase HPLC. Using this technique, sialic acid
standards were measured in quantities as low as 50 fmol (data not shown). Sialic acid
levels of an insect cell line (Sf-9) and a mammalian cell line (Chinese hamster ovary,
CHO) were compared (Table 2). The sialic acid content in cell lysates before and
after filtration through a 10,000 MWCO membrane was determined by DMB labeling
and HPLC separation. The native sialic acid levels in Sf-9 cells grown without fetal
bovine serum (FBS) supplementation are substantially lower than the levels found in
CHO cells (Table 2; Figure 37A). To ensure that the low sialic acid content was not
due to the absence of serum, the sialic acid content of insect cells cultured in 10%
FBS was determined. Even with FBS addition, the Neu5Ac content of Sf-9 cells is
nearly an order of magnitude lower than the content of CHO cells (Table 2). The
origin of the sialic acid detected in insect cells, whether natively produced or the
result of contamination from the media, is not clear since even serum free insect cell
media contains significant levels of sialic acid (data not shown).

Table 2. Sialic Acid Content of CHO and Sf-9 Cell Lines

                 KDN (fmol/ug protein)           Neu5Ac (fmol/ug protein)
                 + Filtration   - Filtration     + Filtration   - Filtration
Sf-9                  -              -                20             30
Sf-9 + FBS            -              -                80            600
CHO                  70            100               900          4,200

CHO and Sf-9 cells were grown to confluency in T-75 flasks. Cell lysates with and
without 10,000 MWCO filtration were analyzed for sialic acid content following DMB
derivatization and HPLC separation. Sialic acid levels have been normalized based on
lysate protein content. Dashes indicate sialic acid was not detectable.



The lack of large sialic acid pools in Sf-9 cells grown in serum-free media
facilitated the detection of sialic acids produced by recombinant enzymes. In order to
examine the production of sialic acids from cells infected with recombinant virus,
Sf-9 cells were infected with AcSAS and a negative control virus, A35. The A35 virus
was generated by recombining a transfer vector without a gene inserted downstream
of the polh promoter. Low levels of Neu5Ac were observed in lysates from insect
cells infected by either virus (Figure 37B), indicating that additional Neu5Ac was not
produced following the expression of SAS. However, a significant new peak was
seen in AcSAS lysates at 12.5 min. that was not observed in A35 negative control
lysates (Figure 37B). Published chromatograms suggested the unknown early eluting
peak could be N-glycolylneuraminic acid (Neu5Gc) or KDN (Inoue et al., 1998). The
elution time of the unknown peak was the same as that of the DMB-derivatized KDN
standard (Figure 37B), and the peak co-chromatographed with authentic DMB-KDN
(data not shown), confirming KDN generation in AcSAS infected Sf-9 cells. KDN was
not detected in uninfected Sf-9 cells either with or without FBS supplementation
(Table 2).

In a further attempt to demonstrate Neu5Ac synthetic functionality, the culture
media was supplemented with ManNAc, the metabolic precursor of Neu5Ac. In
addition to a DMB-KDN peak, a prominent peak eluting at 17.5 min. corresponding
with that of the Neu5Ac standard was observed from the lysates of ManNAc
supplemented Sf-9 cells infected with AcSAS (Figure 37C). Neu5Ac quantities were
more than 100 times lower in the uninfected lysates and even less in A35 infected
lysates (Table 3).

Sialic acid levels were quantified in lysates of uninfected, A35 infected, and
AcSAS infected Sf-9 cells grown in media with and without Man, mannosamine
(ManN), or ManNAc supplementation (Table 3). In uninfected cells, Man feeding
resulted in detection of KDN slightly above background, and ManNAc feeding
marginally increased Neu5Ac levels in uninfected and A35 infected cells (Table 3).
ManN supplementation had no effect on KDN levels but increased Neu5Ac levels
(Table 3). The most significant changes in sialic acid levels occurred with AcSAS
infection. AcSAS infection of Sf-9 cells led to large increases in KDN levels with
slight enhancements upon Man or ManNAc supplementation. Both AcSAS infection
and ManNAc feeding were required to obtain substantial Neu5Ac levels.



Table 3. Sialic Acid Content of Sf-9 with Media Supplementation

                     KDN (fmol/ug protein)              Neu5Ac (fmol/ug protein)
Feeding:         None    Man     ManN    ManNAc     None    Man     ManN    ManNAc
No Infection      -       20      -       -          30      20      60       140
A35               -       -       -       -          80      80     100       120
AcSAS           5,300   7,600   5,200   6,300        50      40     200    27,000

Uninfected, A35 infected, and AcSAS infected Sf-9 cells were grown in unsupplemented media
and media that was supplemented with 10 mM Man, ManN, or ManNAc. Cell lysates were
analyzed for KDN and Neu5Ac content using DMB derivatization and HPLC separation. Sialic
acid levels have been normalized based on lysate protein content. Dashes indicate sialic acid was
not detectable.
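
The statement that Neu5Ac quantities in uninfected and A35 infected lysates were more than 100 times lower than in AcSAS infected lysates can be checked against the ManNAc-fed values reported in Table 3. The following is a minimal sketch; the dictionary and function names are hypothetical, and only the three Neu5Ac values are taken from the table.

```python
# Neu5Ac (fmol/ug protein) in ManNAc-fed Sf-9 lysates: the last Neu5Ac column of Table 3.
NEU5AC_MANNAC_FED = {"No infection": 140, "A35": 120, "AcSAS": 27_000}


def fold_over(reference, table=NEU5AC_MANNAC_FED, treated="AcSAS"):
    """Fold increase of the treated lysate over the named reference lysate."""
    return table[treated] / table[reference]


folds = {name: fold_over(name) for name in ("No infection", "A35")}

if __name__ == "__main__":
    for name, fold in folds.items():
        print(f"AcSAS vs {name}: {fold:.0f}-fold")
```

Both comparisons come out well above 100-fold, in agreement with the text.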



The presence of KDN and Neu5Ac in AcSAS lysates has been confirmed by
high-performance anion-exchange chromatography (HPAEC) with a pulsed
amperometric detector (Figure 37D). When culture media is supplemented with
ManNAc, peaks with elution times corresponding to authentic KDN and Neu5Ac
standards are seen in AcSAS infected lysates that are absent in A35 infected lysates.
Neu5Ac aldolase has been demonstrated previously to break Neu5Ac into ManNAc
and pyruvic acid (Comb and Roseman, J. Biol. Chem. 235, 2529-2537 (1960)) and
KDN into Man and pyruvic acid (Nadano et al., J. Biol. Chem. 261, 11550-11557
(1986)). KDN and Neu5Ac disappear from the AcSAS lysates after aldolase
treatment (Figure 37D). A similar disappearance of the sialic acid peaks following
aldolase treatment was observed using DMB-labeling and HPLC analysis (data not
shown).

In Vitro Activity of Human Sialic Acid Synthetase

The mammalian pathway for Neu5Ac synthesis uses a phosphate
intermediate (Jourdian et al., J. Biol. Chem. 239, PC2714-PC2716 (1964); Kundig et
al., J. Biol. Chem. 241, 5619-5626 (1966); Watson et al., J. Biol. Chem. 241,
5627-5636 (1966)) while the E. coli pathway directly converts ManNAc and PEP to
Neu5Ac (Vann et al., Glycobiology 7, 697-701 (1997)). In order to determine which
substrates are used by the human enzyme, in vitro assays were performed using
lysates of infected Sf-9 cells and protein purified from the prokaryotic expression
system. Lysates or purified protein plus PEP and MnCl2 (Angata et al., J. Biol. Chem.
274, 22949-22956 (1999)) were incubated with Man, mannose-6-phosphate
(Man-6-P), ManNAc, or ManNAc-6-P followed by DMB labeling and HPLC analysis.

AcSAS infected cell lysates incubated with ManNAc-6-P and PEP produced a
peak eluting at 5.5 min (Figure 38A) consistent with phosphorylated sugars. In
previous studies, phosphorylated KDN was detected as DMB-KDN after alkaline
phosphatase (AP) treatment and DMB derivatization (Angata et al., J. Biol. Chem.
274, 22949-22956 (1999)). Similarly, following AP treatment the peak eluting at
5.5 min. was replaced by one that eluted at the same time as authentic Neu5Ac (Figure
38A). Likewise, an early eluting peak from the incubation mixture containing
Man-6-P yielded a KDN peak after AP treatment (Figure 38B). No sialic acid products
were detected when A35 infected cell lysates were used in the equivalent assays or
when Man or ManNAc were used as substrates (data not shown).

Assays were performed by incubating lysates with substrate solutions containing
different concentrations of Man-6-P and ManNAc-6-P in order to evaluate substrate
preference. After incubation for a fixed time period, the samples were treated with
AP, and DMB derivatives of Neu5Ac and KDN were quantified and compared (Table
4). When equimolar amounts of substrates are used, Neu5Ac production is
significantly favored over KDN, especially at higher equimolar concentrations (10 and
20 mM) of the two substrates. Only when the concentration of ManNAc-6-P
is substantially lower than that of Man-6-P are production levels of the two sialic
acids comparable. When the ManNAc-6-P concentration is 1 mM and the Man-6-P
level is 20 mM, the Neu5Ac:KDN production ratio approaches unity. Therefore, the
enzyme prefers ManNAc-6-P over Man-6-P in the production of phosphorylated
forms of Neu5Ac and KDN, respectively.

Table 4. Competitive Formation of Neu5Ac and KDN

Concentration in Substrate Solution (mM)    Final Concentration (pmol/µl)    Neu5Ac/KDN
Man-6-P        ManNAc-6-P                   KDN          Neu5Ac              Ratio

1              1                            8            33                  4.2
5              1                            19           47                  2.5
10             1                            33           53                  1.6
20             1                            56           60                  1.1
5              5                            14           190                 14
10             10                           18           440                 24
20             20                           16           820                 51
20             5                            40           300                 7.6
20             10                           18           470                 25

Lysates from AcSAS infected Sf-9 cells were incubated with substrate solutions
containing the indicated concentrations of Man-6-P and ManNAc-6-P. After incubation
and AP treatment, samples were analyzed for KDN and Neu5Ac content using DMB
derivatization and HPLC separation. Neu5Ac and KDN concentrations of the final
solution (50 µl) and the Neu5Ac/KDN ratio are reported.
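
The ratio column of Table 4 can be recomputed from the two concentration columns. The following is a minimal sketch in Python, not part of the original disclosure; the names TABLE_4 and recomputed_ratios are illustrative. Because the tabulated concentrations are rounded, the recomputed ratios agree with the reported ratios only approximately (within about 5%).

```python
# Rows of Table 4 as reported:
# (Man-6-P mM, ManNAc-6-P mM, KDN pmol/ul, Neu5Ac pmol/ul, reported Neu5Ac/KDN ratio)
TABLE_4 = [
    (1, 1, 8, 33, 4.2),
    (5, 1, 19, 47, 2.5),
    (10, 1, 33, 53, 1.6),
    (20, 1, 56, 60, 1.1),
    (5, 5, 14, 190, 14),
    (10, 10, 18, 440, 24),
    (20, 20, 16, 820, 51),
    (20, 5, 40, 300, 7.6),
    (20, 10, 18, 470, 25),
]


def recomputed_ratios(rows=TABLE_4):
    """Return (Man-6-P, ManNAc-6-P, recomputed ratio, reported ratio) per row."""
    return [
        (man6p, mannac6p, neu5ac / kdn, reported)
        for man6p, mannac6p, kdn, neu5ac, reported in rows
    ]


if __name__ == "__main__":
    for man6p, mannac6p, calc, reported in recomputed_ratios():
        print(f"Man-6-P {man6p:>2} mM, ManNAc-6-P {mannac6p:>2} mM: "
              f"recomputed {calc:.2f}, reported {reported}")
```

With 20 mM Man-6-P and 1 mM ManNAc-6-P the recomputed ratio is 60/56, close to unity, whereas at equimolar 20 mM it is 820/16, in line with the stated preference for ManNAc-6-P.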



Discussion of Human Sialic Acid Synthetase Characterization 

We have identified the sequence of a human sialic acid phosphate synthetase
gene, SAS, whose protein product condenses ManNAc-6-P or Man-6-P with PEP to
form Neu5Ac and KDN phosphates, respectively. To our knowledge, this is the first
report of the cloning of a eukaryotic sialic acid phosphate synthetase gene. Despite
the importance of sialic acids in many biological recognition phenomena, sialic acid
phosphate synthetase genes have not been cloned because the enzymes they encode
are unstable and difficult to purify (Watson et al., J. Biol. Chem. 241, 5627-5636
(1966); Angata et al., J. Biol. Chem. 274, 22949-22956 (1999)). Even the E. coli
sialic acid synthetase enzyme, whose sequence is known, has low specific activity and
is unstable (Vann et al., Glycobiology 7, 697-701 (1997)).

Consequently, a bioinformatics approach based on the E. coli synthetase
sequence was used to identify a putative human gene 36% identical and 56% similar
to neuB. In vitro transcription and translation verified an open reading frame which
encoded a 359 amino acid protein. In addition, Northern blots revealed ubiquitous
transcription of the human synthetase gene in a selection of human tissues. The wide
distribution of SAS mRNA is consistent with the detection of sialic acids in many
different mammalian tissues (Inoue and Inoue, Sialobiology and Other Novel Forms
of Glycosylation (Osaka, Japan: Gakushin Publishing) pp.57-67 (1999)).

Using the baculovirus expression system, the 40 kD sialic acid phosphate
synthetase enzyme, SAS, was expressed in cells. The use of Sf-9 cells, which have
little if any native sialic acid, greatly facilitated the detection of sialic acids and the
characterization of SAS. However, Neu5Ac was observed only when insect cells
were infected with AcSAS and the cell culture media was supplemented with
ManNAc, a sialic acid precursor. This ManNAc feeding requirement indicates that
Sf-9 cells may lack sizeable ManNAc pools and synthetic pathways.

SAS was identified based on homology with neuB whose enzyme product
directly forms Neu5Ac from ManNAc and PEP (Vann et al., Glycobiology 7, 697-701
(1997)). Furthermore, insect cells produce Neu5Ac following recombinant SAS
expression and ManNAc supplementation. However, mammalian cells are known
only to produce Neu5Ac from ManNAc through a three-step pathway with
phosphorylated intermediates. Therefore, in vitro assays were performed to determine
the substrate specificity of SAS. Both AcSAS infected insect cell lysates and protein
purified from the prokaryotic expression system were assayed using ManNAc and
ManNAc-6-P as possible substrates. A rapidly eluting DMB derivatized product,
typical of a phosphorylated sialic acid, was observed only when ManNAc-6-P was
used as the substrate. Furthermore, this peak disappears with the appearance of an
unsubstituted DMB-Neu5Ac peak following AP treatment. SAS therefore condenses
PEP and ManNAc-6-P to form a Neu5Ac phosphate product. Although the exact
position of the phosphorylated carbon on the product has not yet been specified, SAS
is likely the sialic acid phosphate synthetase enzyme of the previously described
three-step mammalian pathway (Kundig et al., J. Biol. Chem. 241, 5619-5626 (1966);
Watson et al., J. Biol. Chem. 241, 5627-5636 (1966); Jourdian et al., J. Biol. Chem.
239, PC2714-PC2716 (1964)). Despite little if any native pools of sialic acids, Sf-9
cells natively possess the ability to complete the three-step mammalian pathway when
only the sialic acid phosphate synthetase gene is provided. Sf-9 cells have been
shown to have substantial ManNAc kinase ability (Effertz et al., J. Biol. Chem. 274,
28771-28778 (1999)), and phosphatase activity has also been detected in insect cells
(Sukhanova et al., Genetika 34, 1239-1242 (1998)).

The capacity to produce sialic acids in Sf-9 cells following AcSAS infection
and ManNAc supplementation at levels even higher than those seen in mammalian
cell lines such as CHO may help overcome a major limitation of the baculovirus
expression system. N-glycans of recombinant glycoproteins produced in insect cells
lack significant levels of terminal sialic acid residues (Jarvis and Finn, Virology 212,
500-511 (1995); Ogonah et al., Bio/Technology 14, 197-202 (1996)). The lack of
sialylation on human thyrotropin produced by the baculovirus expression system
resulted in rapid in vivo thyrotropin clearance as compared to thyrotropin produced by
a mammalian system (Grossmann et al., Endocrinology 138, 92-100 (1997)).
Generation of significant sialic acid pools along with expression of other genes such
as sialyltransferases may lead to production of significant levels of sialylated
glycoproteins in insect cells.

Another interesting observation was the occurrence of a second DMB reactive
peak in AcSAS infected Sf-9 lysates. This peak has been identified as KDN, a
deaminated Neu5Ac. We subsequently demonstrated that the SAS enzyme generates
KDN phosphate from Man-6-P and PEP in vitro. While Neu5Ac production in insect
cells requires both AcSAS infection and ManNAc supplementation, only AcSAS
infection is necessary for KDN synthesis. Therefore, significant substrate pools for
the generation of KDN already exist in insect cells or are present in the media. In
addition, mannose feeding increased KDN production even further. Interestingly,
Man feeding of the uninfected insect cells increased KDN levels above background,
and ManNAc feeding also led to higher Neu5Ac levels in uninfected cells. Therefore,
insect cells may possess limited native sialic acid synthetic ability. Similar substrate
supplementation results have been reported in mammalian cells, as cultivation in
Man-rich or ManNAc-rich media enhanced the synthesis of native intracellular KDN
and Neu5Ac, respectively (Angata et al., Biochem. Biophys. Res. Commun. 261, 326-
331 (1999)).

This study is the first report of a eukaryotic gene encoding any enzyme with
KDN synthetic ability. Recently, KDN enzymatic activity has been characterized in
trout testis, a tissue high in KDN content. KDN is synthesized from Man in trout
through a three-step pathway involving a synthetase with a Man-6-P substrate
(Angata et al., J. Biol. Chem. 274, 22949-22956 (1999)). However, the fish
synthetase enzyme, partially purified from trout testis, was approximately 80 kD as
compared to the human enzyme of 40 kD. Furthermore, KDN and Neu5Ac phosphate
synthesis in trout were likely catalyzed by two separate synthetase activities (Angata
et al., J. Biol. Chem. 274, 22949-22956 (1999)) while the current study indicates that
both products were generated from a single human enzyme with broad substrate
specificity.

Neu5Ac, usually bound to glycoconjugates, is the predominant sialic acid
found in mammalian tissue, but KDN, primarily found free in the ethanol soluble
fractions, has also been detected in all human tissues examined so far (Inoue and Inoue,
Sialobiology and Other Novel Forms of Glycosylation (Osaka, Japan: Gakushin
Publishing) pp.57-67 (1999)). The ratio of Neu5Ac to KDN is on the order of 100:1
in blood cells and ovaries (Inoue et al., 1998), although this ratio may change during
development and cancer. The levels of free KDN in newborn fetal cord red blood
cells are higher than those of maternal red blood cells (Inoue et al., J. Biol. Chem.
273, 27199-27204 (1998)). Furthermore, a 4.2 fold increase in the ratio of free KDN
to free Neu5Ac was observed in ovarian tumor cells as compared to normal cells, and
the ratio appears to increase with the extent of invasion or malignancy for ovarian
adenocarcinomas (Inoue et al., J. Biol. Chem. 273, 27199-27204 (1998)).

Because the KDN/Neu5Ac ratio has biological significance, we performed
competitive in vitro assays with insect cell lysates using both ManNAc-6-P and
Man-6-P as substrates. SAS demonstrated a preference for phosphorylated Neu5Ac over
phosphorylated KDN synthesis in vitro, although the concentrations of the particular
substrates relative to the enzyme level altered this production ratio. Thus changes in
the ratios of free KDN to Neu5Ac observed in different developmental states and
cancer tissue may reflect variability either in the levels of specific substrates or the
amount of active enzyme present in vivo. The identification of the SAS genetic
sequence and characterization of the enzyme it encodes should help further our
understanding of sialic acid biosynthesis as well as the roles sialic acids play in
development and disease states.

Figure 39 demonstrates the production of sialylated nucleotides in Sf-9 insect cells
following infection with baculoviruses containing human CMP-SA synthetase and SA
synthetase. Sf-9 cells were grown in six well plates and infected
with baculovirus containing CMP-SA synthase and supplemented with 10 mM
ManNAc ("CMP" line), baculovirus containing CMP-SA synthase and SA synthase
plus 10 mM ManNAc supplementation ("CMP+SA" line), or no baculovirus and no
ManNAc supplementation ("SF9" line). The nucleotide sugars from lysed cells were
extracted with 75% ethanol, dried, resuspended in water, and filtered through a 10,000
molecular weight cut-off membrane. Samples were then separated on a Dionex
Carbopac PA-1 column using a Shimadzu VP series HPLC. Nucleotide sugars were
detected based upon their absorbance at 280 nm, and CMP sialic acid standards were
shown to elute at approximately 7 minutes. These results demonstrate the ability to
produce the desired oligosaccharide products in insect cells via introduction and
expression of sialyltransferase enzymes.

Materials and Method of Example 6 
Gene Characterization 

The E. coli neuB coding sequence was used to query the Human Genome
Sciences (Rockville, MD) cDNA database with BLAST software. One EST clone,
HMKAK61, from a human (liver) cDNA library demonstrated significant homology
to neuB and was chosen for further characterization. The tissue distribution profile
was determined by Northern blot hybridization. Briefly, the cDNA was radio-labeled
with [32P]-dCTP using a RediPrime™II kit (Amersham/Pharmacia Biotech,
Piscataway, NJ) following the manufacturer's directions. Multiple tissue Northern
blots containing poly-A+ RNA (Clontech, Palo Alto, CA) were pre-hybridized at
42°C for 4 hours and then hybridized overnight with radio-labeled probe at 1 x 10^6
CPM/ml. The blots were sequentially washed twice for 15 min. at 42°C and once for
20 min. at 65°C in 0.1X SSC, 0.1% SDS and subsequently autoradiographed.

Baculovirus Cloning and Protein Expression 

The full length ORF was amplified by PCR using the following primers. The
forward primer, 5'-TGTAATACGACTCACTATAGGGCGGATCCGCCATCATGCCGCTGGAGCTGGAGC
(SEQ ID NO:13), contained a synthetic T7 promoter sequence (underlined), a
BamHI site (italics), a KOZAK sequence (bold), and sequence corresponding to the
first six codons of SAS. The minus strand primer,
5'-GTACGGTACCTTATTAAGACTTGATTTTTTTGCC (SEQ ID NO:14), contained
an Asp 718 site (italics), two in-frame stop codons (underlined), and sequences
representing the last six codons of SAS.
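
The primer features listed above can be checked against the SAS coding sequence of FIG. 31. The sketch below is illustrative only. It assumes that the characters garbled in the extracted primer text correspond to the BamHI site GGATCC and the Asp 718 site GGTACC, and it takes the first and last six codons from FIG. 31; the function and constant names are hypothetical.

```python
# Primer sequences as read from the text; the restriction sites are
# reconstructed from the enzyme names (assumption, see lead-in).
FORWARD_PRIMER = "TGTAATACGACTCACTATAGGGCGGATCCGCCATCATGCCGCTGGAGCTGGAGC"
REVERSE_PRIMER = "GTACGGTACCTTATTAAGACTTGATTTTTTTGCC"

T7_PROMOTER = "TAATACGACTCACTATAGGG"   # core T7 promoter sequence
BAMHI_SITE = "GGATCC"
ASP718_SITE = "GGTACC"
ORF_FIRST_SIX_CODONS = "ATGCCGCTGGAGCTGGAG"  # start of FIG. 31
ORF_LAST_SIX_CODONS = "GGCAAAAAAATCAAGTCT"   # end of FIG. 31, before the stop codon


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def check_primers() -> dict:
    """Check each primer for the features described in the text."""
    sense_of_reverse = reverse_complement(REVERSE_PRIMER)
    return {
        "forward_has_t7_promoter": T7_PROMOTER in FORWARD_PRIMER,
        "forward_has_bamhi_site": BAMHI_SITE in FORWARD_PRIMER,
        "forward_has_first_six_codons": ORF_FIRST_SIX_CODONS in FORWARD_PRIMER,
        "reverse_has_asp718_site": ASP718_SITE in REVERSE_PRIMER,
        # On the sense strand the last six codons are followed by two stops.
        "reverse_has_last_codons_and_two_stops":
            sense_of_reverse.startswith(ORF_LAST_SIX_CODONS + "TAATAA"),
    }


if __name__ == "__main__":
    for feature, present in check_primers().items():
        print(f"{feature}: {present}")
```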

After amplification, the PCR product was digested with BamHI and Asp 718
(Roche, Indianapolis, IN) and the resulting fragment cloned into the corresponding
sites of the baculovirus transfer vector, pA2. Following DNA sequence confirmation,
the plasmid (pA2-SAS) was transfected into Sf-9 cells to generate the recombinant
baculovirus AcSAS as previously described (Coleman et al., Gene 190, 163-171
(1997)). Amplified virus was used to infect cells, and the gene product was
radio-labeled with [35S]-Met and [35S]-Cys. Bands corresponding to the gene product were
visualized by SDS-PAGE and autoradiography. Alternatively, the PCR product was
used as a template for in vitro transcription and translation using rabbit reticulocyte
lysate (Promega, Madison, WI) in the presence of [35S]-Met. Translation products
were resolved by SDS-PAGE and visualized by autoradiography.

For protein production, Sf-9 cells were seeded in serum-free media at a
density of 1 x 10^6 cells/ml in spinner flasks and infected at a multiplicity of infection of
1-2 with the recombinant virus. A detergent fractionation procedure was employed
(Miyamoto et al., Mol. Cell. Biol. 5, 2860-2865 (1985)) to separate nuclear from
non-nuclear fractions. Protein was resolved by SDS-PAGE, transferred to a ProBlott™
membrane (ABI, Foster City, CA), and visualized by Ponceau S staining. A
prominent band at the expected MW of ~40 kD was visible and excised for protein
microsequencing using an ABI-494 sequencer (PE Biosystems, Foster City, CA).

Neu5Ac/KDN Detection 

Sialic acid was measured by the procedure of Hara et al. (Anal. Biochem. 179,
162-166 (1989)). Ten microliters of sample were treated with 200 µl DMB (Sigma
Chemicals, St. Louis, MO) solution (7.0 mM DMB in 1.4 M acetic acid, 0.75 M
β-mercaptoethanol, and 18 mM sodium hydrosulfite) at 50°C for 2.5 hrs, from which 10
µl was used for HPLC analysis on a Shimadzu (Columbia, MD) VP series HPLC
using a Waters (Milford, MA) Spherisorb 5 µm ODS2 column. Peaks were detected
using a Shimadzu RF-10AXL fluorescence detector with 448 nm emission and 373
nm excitation wavelengths. The mobile phase was an acetonitrile, methanol, and
water mixture (9:7:84, v/v) with a flow rate of 0.7 ml/min. Response factors of
Neu5Ac and KDN were established with authentic standards based on peak areas for
quantifying sample sialic acid levels. Sialic acid content was normalized based on
protein content measured with the Pierce (Rockford, IL) BCA assay kit and a
Molecular Devices (Sunnyvale, CA) microplate reader.
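
The quantification described above can be sketched as follows. This is an illustrative calculation only: it assumes the response factor is expressed as peak area per pmol injected, it uses the stated volumes (10 µl sample in 200 µl DMB solution, 10 µl injected), and the peak areas and protein concentration in the usage example are hypothetical values, not data from this work.

```python
SAMPLE_UL = 10.0          # sample volume derivatized (from the text)
DMB_SOLUTION_UL = 200.0   # DMB solution added (from the text)
INJECTED_UL = 10.0        # volume injected onto the HPLC (from the text)


def response_factor(standard_peak_area: float, standard_pmol: float) -> float:
    """Peak area per pmol, established with an authentic standard."""
    return standard_peak_area / standard_pmol


def sample_pmol_per_ul(peak_area: float, rf: float) -> float:
    """Sialic acid concentration in the original sample (pmol/ul)."""
    pmol_injected = peak_area / rf
    reaction_volume = SAMPLE_UL + DMB_SOLUTION_UL
    pmol_in_reaction = pmol_injected * reaction_volume / INJECTED_UL
    return pmol_in_reaction / SAMPLE_UL


def normalize_to_protein(pmol_per_ul: float, protein_ug_per_ul: float) -> float:
    """Sialic acid content per microgram of protein (pmol/ug)."""
    return pmol_per_ul / protein_ug_per_ul


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    rf = response_factor(standard_peak_area=50000.0, standard_pmol=100.0)
    conc = sample_pmol_per_ul(peak_area=12500.0, rf=rf)
    print(f"{conc:.1f} pmol/ul, "
          f"{normalize_to_protein(conc, protein_ug_per_ul=2.0):.2f} pmol/ug protein")
```

Separate response factors would be needed for Neu5Ac and KDN, since each was calibrated with its own standard.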

Cell Culture and Sialic Acid Quantification

Sf-9 (ATCC, Manassas, VA) cells were grown in Ex-Cell™ 405 media (JRH
Bioscience, Lenexa, KS) with and without 10% FBS at 27°C. CHO-K1 cells (ATCC,
Manassas, VA) were cultured at 37°C in a humidified atmosphere with 5% CO2 in
Dulbecco's Modified Eagle Medium (Life Technologies, Rockville, MD)
supplemented with 10% FBS, 100 U/ml penicillin, 100 µg/ml streptomycin, 100 µM
MEM essential amino acids, and 4 mM L-glutamine (Life Technologies, Rockville,
MD). Cells were grown to confluency in T-75 flasks, washed twice with PBS, and
lysed in 0.05 M bicine, pH 8.5, with 1 mM DTT (Vann et al., Glycobiology 7, 697-
701 (1997)) using a Tekmar Sonic Disrupter (Cincinnati, OH). For determination of
sialic acid content, 10 µl of lysates with and without 10,000 MWCO microfiltration
(Millipore, Bedford, MA) were analyzed by DMB derivatization as described above.

Sugar substrate feeding was studied by plating approximately 10^6 Sf-9 cells in
each well of a six well plate. Media was replaced with 2 ml fresh media
supplemented with 10 mM sterile-filtered Man, ManN, or ManNAc. Cells were left
uninfected or infected with 20 µl of the appropriate (A35 or AcSAS) amplified
baculovirus stock. Cells were harvested at 80 hours post infection by separating the
pellet from the media by centrifugation and washing twice with PBS. Cells were
lysed and analyzed for sialic acid content as described above.

In vitro Activity 

In vitro activity assays were based on the procedure of Angata et al. (J. Biol.
Chem. 274, 22949-22956 (1999)). Lysates were prepared from A35 and AcSAS
infected and uninfected Sf-9 cells cultured in T-75 flasks with and without 10 mM
ManNAc supplementation. After washing twice with PBS, cells were lysed on ice
with 25 strokes of a tight-fitting Dounce homogenizer (Wheaton, Millville, NJ) in 2.5
ml lysis buffer [50 mM HEPES pH = 7.0 with 1 mM DTT, leupeptin (1 µg/ml),
antipain (0.5 µg/ml), benzamidine-HCl (15.6 µg/ml), aprotinin (0.5 µg/ml),
chymostatin (0.5 µg/ml), and 1 mM phenylmethylsulfonylfluoride]. Five microliters of
substrate solution were incubated with either 20 µl insect cell lysate (30 min.) or purified
E. coli protein (60 min.) at 37°C. The substrate solution contained 10 mM MnCl2, 20 mM
PEP, and either 5 mM ManNAc-6-P or 25 mM Man-6-P (Sigma, St. Louis, MO).
ManNAc-6-P was prepared by acid hydrolysis of meningococcal Group A
polysaccharide. The polysaccharide (15.5 mg) in 5.8 ml water was mixed with 770
mg of Dowex 50 H+ and heated for 1 hr. at 100°C. The filtered hydrolysate was dried
in vacuo and the residue dissolved to give a solution of 50 mM ManNAc-6-P and
stored frozen. Substrate solutions containing 25 mM Man and ManNAc were also
used. Boiled samples were used as negative controls. Following incubation, all
samples were boiled 3 min., centrifuged for 10 min. at 12,000g, and split into two 10
µl aliquots. One aliquot was treated with 9 units of calf intestine alkaline phosphatase
(Roche, Indianapolis, IN) along with 3 µl of accompanying buffer while the other
aliquot was diluted with water and buffer. AP treated aliquots were incubated 4 hrs.
at 37°C, and 10 µl of both AP treated and untreated samples were reacted with DMB
as described above. Two microliters of the samples incubated with insect lysates and
10 µl of the samples incubated with bacterial protein were injected onto the HPLC for
sialic acid analysis as described above.
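
The concentrations in the assay mixture follow from the volumes given above. The short sketch below works through that dilution for the insect lysate case (5 µl substrate solution plus 20 µl lysate). It assumes the volumes are additive and that the lysate itself contributes negligible amounts of the listed components; the function name is illustrative.

```python
SUBSTRATE_SOLUTION_UL = 5.0   # substrate solution volume (from the text)
LYSATE_UL = 20.0              # insect cell lysate volume (from the text)

# Stock concentrations in the substrate solution, in mM (from the text).
SUBSTRATE_SOLUTION_MM = {
    "MnCl2": 10.0,
    "PEP": 20.0,
    "ManNAc-6-P": 5.0,
    "Man-6-P": 25.0,  # used instead of ManNAc-6-P in the KDN assays
}


def final_concentrations_mM(stock_mM: dict,
                            substrate_ul: float = SUBSTRATE_SOLUTION_UL,
                            lysate_ul: float = LYSATE_UL) -> dict:
    """Dilute each stock concentration into the combined assay volume."""
    total_ul = substrate_ul + lysate_ul
    return {name: conc * substrate_ul / total_ul for name, conc in stock_mM.items()}


if __name__ == "__main__":
    for name, conc in final_concentrations_mM(SUBSTRATE_SOLUTION_MM).items():
        print(f"{name}: {conc:g} mM in the assay")
```

Under these assumptions the substrate solution is diluted five-fold in the assay.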

For substrate competition experiments, Man-6-P and ManNAc-6-P
concentrations in the substrate solution were varied from 1 to 20 mM. In vitro assays
were run with Sf-9 lysates as described above. Samples were treated with 7 µl buffer
and 18 units of AP, incubated for 4 hrs. at 37°C, and analyzed for sialic acid content.
Samples containing more than 1 mM ManNAc-6-P in the substrate solution produced
high levels of sialic acid and were diluted 1:5 before injection to avoid fluorescence
detector signal saturation.

Analysis with Aldolase Using HPAEC 

Sf-9 cells were grown in T-75 flasks and then infected with A35 or AcSAS or
left uninfected in the presence or absence of 10 mM ManNAc. After 80 hrs., cells
were washed twice in PBS and sonicated. Aliquots (200 µl) were filtered through
10,000 MWCO membranes, and 50 µl samples were treated with 12.5 µl aldolase
solution [0.0055 U aldolase (ICN, Costa Mesa, CA), 1.4 mM NADH (Sigma, St.
Louis, MO), 0.5 M HEPES pH 7.5, 0.7 U lactate dehydrogenase (Roche, Indianapolis,
IN)] or left untreated and incubated at 37°C for one hour (Lilley et al., 1992).
Samples were analyzed by HPAEC with a Dionex (Sunnyvale, CA) BioLC system
using a pulsed amperometric detector (PAD-II) on a Carbopac PA-1 column. The
initial elution composition was 50% A (200 mM NaOH), 45% B (water), and 5% C
(1 M NaOAc, 200 mM NaOH) with a linear gradient to 50% A, 25% B, and 25% C at
20 min. A 6 min. 50% A and 50% C washing followed. Samples were normalized
based on protein content by dilution with water, and 20 µl of each sample were
analyzed. Ten µl of each sample were also derivatized with DMB and analyzed by
HPLC as described above to confirm the elimination of sialic acids by aldolase
treatment.
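
The HPAEC gradient described above can be expressed as eluent concentrations over time. The following sketch assumes simple linear interpolation between the stated start and end compositions and ideal mixing of the three eluents; the names are illustrative.

```python
# Eluent stock concentrations in mM (from the text).
ELUENTS_MM = {
    "A": {"NaOH": 200.0, "NaOAc": 0.0},
    "B": {"NaOH": 0.0, "NaOAc": 0.0},       # water
    "C": {"NaOH": 200.0, "NaOAc": 1000.0},  # 1 M NaOAc in 200 mM NaOH
}
START_PERCENT = {"A": 50.0, "B": 45.0, "C": 5.0}
END_PERCENT = {"A": 50.0, "B": 25.0, "C": 25.0}
WASH_PERCENT = {"A": 50.0, "B": 0.0, "C": 50.0}
GRADIENT_MIN = 20.0


def composition_percent(t_min: float) -> dict:
    """Percent of each eluent at time t_min within the linear gradient."""
    fraction = min(max(t_min / GRADIENT_MIN, 0.0), 1.0)
    return {
        name: START_PERCENT[name] + fraction * (END_PERCENT[name] - START_PERCENT[name])
        for name in START_PERCENT
    }


def concentrations_mM(percent: dict) -> dict:
    """NaOH and NaOAc concentrations (mM) for a given eluent composition."""
    return {
        solute: sum(percent[name] / 100.0 * ELUENTS_MM[name][solute] for name in percent)
        for solute in ("NaOH", "NaOAc")
    }


if __name__ == "__main__":
    for t in (0.0, 10.0, 20.0):
        print(t, concentrations_mM(composition_percent(t)))
    print("wash", concentrations_mM(WASH_PERCENT))
```

Under these assumptions the run starts at 110 mM NaOH with 50 mM NaOAc and reaches 150 mM NaOH with 250 mM NaOAc at 20 min.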

WHAT IS CLAIMED IS:

1. A cell of interest producing the donor substrate CMP-SA above
endogenous levels.

2. A cell of interest producing an acceptor substrate, the donor substrate 
CMP-SA, and expressing the enzyme sialyltransferase; wherein said acceptor 
substrate is a glycan. 



3. The cell of claim 2 wherein said glycan is a branched glycan 
comprising GalGlcNAcMan by at least one branch of said glycan and said Gal is a 
terminal Gal. 

4. The cell of claim 3 wherein said glycan is an asparagine-linked glycan. 



5. A cell of interest producing sialylated glycoprotein above endogenous 

levels. 

6. The cell of claim 5, wherein said glycoprotein is asparagine (N)-linked. 

7. The cell of claim 5, wherein said glycoprotein is heterologous. 



8. The cell of claim 7, wherein said heterologous glycoprotein is 
mammalian. 



9. The cell of claim 5, wherein said mammalian glycoprotein is selected
from the group consisting of plasminogen, transferrin, Na+,K+-ATPase, and
thyrotropin.

10. The cell of claim 5, wherein said cell expresses at least one enzyme 
selected from the group consisting of: 

a) GlcNAc-2 epimerase; 

b) an enzyme catalyzing conversion of UDP-GlcNAc to ManNAc;

c) sialic acid synthetase;

d) aldolase; 

e) CMP-SA synthetase; 

f) CMP-SA transporter; and 
wherein said expression is above endogenous levels. 

11. The cell of claim 10, wherein said cell expresses enzyme (a). 

12. The cell of claim 11, wherein said enzyme is human.

13. The cell of claim 10, wherein said cell expresses enzyme (b). 

14. The cell of claim 13, wherein said enzyme is human. 

15. The cell of claim 10, wherein said cell expresses enzyme (c). 

16. The cell of claim 15, wherein said cell expresses the enzyme of SEQ
ID NO:6.

17. The cell of claim 10, wherein said cell expresses enzyme (d).

18. The cell of claim 17, wherein said cell expresses the enzyme of SEQ
ID NO:2.

19. The cell of claim 10, wherein said cell expresses enzyme (e).

20. The cell of claim 19, wherein said cell expresses the enzyme of SEQ
ID NO:4.

21. The cell of claim 10, wherein said cell expresses enzyme (f).

22. The cell of claim 21, wherein said enzyme is human.

23. The cell of claim 10 wherein said cell further expresses at least one
enzyme selected from the group consisting of:

i) GalT;

ii) GlcNAc TI;

iii) GlcNAc TII;

iv) sialyltransferase; and

wherein said expression is above endogenous levels.

24. The cell of claim 10, wherein activity of endogenous
N-acetylglucosaminidase is suppressed.

25. A kit for expression of sialylated glycoproteins, comprising the cell of
any of claims 1-24.

26. A method for manipulating glycoprotein production in an insect cell,
said method comprising enhancing expression of at least one enzyme selected from
the group consisting of:

a) GlcNAc-2 epimerase;

b) an enzyme catalyzing conversion of UDP-GlcNAc to ManNAc;

c) sialic acid synthetase;

d) aldolase;

e) CMP-SA synthetase;

f) CMP-SA transporter; and

wherein the expression of each enzyme expressed is enhanced to above endogenous
levels.

27. The method of claim 26, wherein expression of enzyme (a) is
enhanced.

28. The method of claim 27, wherein said enzyme is human.

29. The method of claim 26, wherein expression of enzyme (b) is
enhanced.

30. The method of claim 29, wherein said enzyme is human. 

31. The method of claim 26, wherein expression of enzyme (c) is
enhanced.

32. The method of claim 31, wherein said enzyme has the sequence of
SEQ ID NO:6.

33. The method of claim 26, wherein expression of enzyme (d) is 
enhanced. 

34. The method of claim 33, wherein said enzyme has the sequence of 
SEQ ID NO:2. 

35. The method of claim 26, wherein expression of enzyme (e) is 
enhanced. 

36. The method of claim 35, wherein said enzyme has the sequence of
SEQ ID NO:4.

37. The method of claim 26, wherein expression of enzyme (f) is 
enhanced. 

38. The method of claim 37, wherein said enzyme is human. 

39. The method of claim 26, further comprising enhancing expression of at
least one enzyme selected from the group consisting of:

i) GalT;

ii) GlcNAc TI;

iii) GlcNAc TII;

iv) sialyltransferase; and

wherein the expression of each enzyme expressed is enhanced to above
endogenous levels.

40. The method of claims 26 or 39, further comprising suppressing activity 
of endogenous N-acetylglucosaminidase. 

41. A method for producing sialylated glycoproteins, said method
comprising expressing a heterologous protein in an insect cell manipulated according
to the method of any of claims 26-40.

42. The method of claim 41, wherein said heterologous protein is
mammalian.

43. The method of claim 42, wherein said mammalian protein is selected
from the group plasminogen, transferrin, Na+,K+-ATPase, thyrotropin.

44. A method for producing a sialylated glycoprotein in a cell of interest
said method comprising:

a) determining the carbohydrate substrates in said cell;

b) transforming said cell with enzymes to produce necessary
precursor substrates; and

c) constructing a processing pathway in said cell to produce a
sialylated glycoprotein.


45. The method of claim 44 wherein said cell is selected from the group
consisting of yeast, insect, fungal, plant, and bacterial cells.

[Figure legend: M, mannose; Gl, glucose; GN, N-acetylglucosamine; F, fucose; G, galactose]
[Figure column headings: Enzyme Name; Lysate %; Supernatant %]

ATGGCCTTCCCAAAGAAGAAACTTCAGGGTCTTGTGGCTGCAACCATCACGCCAATGACTGAGAATGGAGAAATCAA
CTTTTCAGTAATTGGTCAGTATGTGGATTATCTTGTGAAAGAACAGGGAGTGAAGAACATTTTTGTGAATGGCACAA 
CAGGAGAAGGCCTGTCCCTGAGCGTCTCAGAGCGTCGCCAGGTTGCAGAGGAGTGGGTGACAAAAGGGAAGGACAAG 
CTGGATCAGGTGATAATTCACGTAGGAGCACTGAGCTTGAAGGAGTCACAGGAACTGGCCCAACATGCAGCAGAAAT
AGGAGCTGATGGCATCGCTGTCATTGCACCGTTCTTCCTCAAGCCATGGACCAAAGATATCCTGATTAATTTCCTAA 
AGGAAGTGGCTGCTGCCGCCCCTGCCCTGCCATTTTATTACTATCACATTCCTGCCTTGACAGGGGTAAAGATTCGT 
GCTGAGGAGTTGTTGGATGGGATTCTGGATAAGATCCCCACCTTCCAAGGGCTGAAATTCAGTGATACAGATCTCTT 
AGACTTCGGGCAATGTGTTGATCAGAATCGCCAGCAACAGTTTGCTTTCCTTTTTGGGGTGGATGAGCAACTGTTGA 
GTGCTCTGGTGATGGGAGCAACTGGAGCAGTGGGCAGTTTTGTATCCAGAGATTTATCAACTTTGTTGTCAAACTAG 
GTTTTGGAGTGTCACAGACCAAAGCCATCATGACTCTGGTCTCTGGGATTCCAATGGGCCCACCCCGGCTTCCACTG
CAGAAAGCCTCCAGGGAGTTTACTGATAGTGCTGAAGCTAAACTGAAGAGCCTGGATTTCCTTTCTTTCACTGATTT 
AAAGGATGGAAACTTGGAAGCTGGTAGCTAGTGCCTCTCTATCAAATCAGGGTTTGCACCTTGAGACATAATCTACC 
TTAAATAGTGCATTTTTTTCTCAGGGAATTTTAGATGAACTTGAATAAACTCTCCTAGCAAATGAAATCTCACAATA
AGCATTGAGGTACCTTTTGTGAGCCTTAAAAAGTCTTATTTTGTGAAGGGGCAAAAACTCTAGGAGTCACAACTCTC 
AGTCATTCATTTCACAGATTTTTTTGTGGAGAAATTTCTGTTTATATGGATGAAATGGAATCAAGAGGAAAATTGTA 
ATTGATTAATTCCATCTGTCTTTAGGAGCTCTCATTATCTCGGTCTCTGGTTCCTAATCCTATTTTAAAGTTGTCTA 
ATTTTAAACCACTATAATATGTCTTCATTTTAATAAATATTCATTTGGAATCTAGGAAAACTCTGAGCTACTGCATT 
TAGGCAGGCACTTTAATACCAAACTGTAACATGTCTCAACTGTATACAACTCAAAATACACCAGCTCATTTGGCTGC 
TCAGTCTAACTCTAGAATGGATGCTTTTGAATTCATTTCGATG 

FIG.27

MAFPKKKLQGLVAATITPMTENGEINFSVIGQYVDYLVKEQGVKNIFVNGTTGEGLSLSVSERRQVAEEWVTKGKDKLDQ
VIIHVGALSLKESQELAQHAAEIGADGIAVIAPFFLKPWTKDILINFLKEVAAAAPALPFYYYHIPALTGVKIRAEELLD
GILDKIPTFQGLKFSDTDLLDFGQCVDQNRQQQFAFLFGVDEQLLSALVMGATGAVGSFVSRDLSTLLSN.VLECHRPKP
S.LWSLGFQWAHPGFHCRKPPGSLLIVLKLN.RAWISFLSLI.RMETWKLVASASLSNQGFAPLRHNL

FIG.28 



ATGGACTCGGTGGAGAAGGGGGCCGCCACCTCCGTCTCCAACCCGCGGGGGCGACCGTCCCGGGGCCGGCCGCCGAAGCT 
GCAGCGCAACTCTCGCGGCGGCCAGGGCCGAGGTGTGGAGAAGCCCCCGCACCTGGCAGCCCTAATTCTGGCCCGGGGAG 
GCAGCAAAGGCATCCCCCTGAAGAACATTAAGCACCTGGCGGGGGTCCCGCTCATTGGCTGGGTCCTGCGTGCGGCCCTG 
GATTCAGGGGCCTTCCAGAGTGTATGGGTTTCGACAGACCATGATGAAATTGAGAATGTGGCCAAACAATTTGGTGCACA 
AGTTCATCGAAGAAGTTCTGAAGTTTCAAAAGACAGCTCTACCTCACTAGATGCCATCATAGAATTTCTTAATTATYATA 
ATGAGGKTGACATTGTAGGAAATATTCAAGCTACTTCTYCATGTTTACATCCTACTGATCTTCAAAAAGTTGCAGAAATG 
ATTCGAGAAGAAGGATATGATTCTGKTTTCTCTGTTGTGAGACGCCATCAGTTTCGATGGAGTGAAATTCAGAAAGGAGT 
TCGTGAAGTGACCGAACCTCTGAATTTAAATCCAGCTAAACGGCCTCGTCGACAAGACTGGGATGGAGAATTATATGAAA
ATGGCTCATTTTATTTTGCTAAAAGACATTTGATAGAGATGGGTTACTTGCAGGGTGGAAAATGGCATACTACGAAATGC
GAGCTGGAACATAGTGTGGATATAGATGTGGATATTGATTGGCCTATTGCAGAGCAAAGAGTATTAAGATATGGCTATTT 
TGGCAAAGAGAAGCTTAAGGAAATAAAACTTTTGGTTTGCAATATTGATGGATGTCTCACCAATGGCCACATTTATGTAT 
CAGGAGACCAAAAAGAAATAATATCTTATGATGTAAAAGATGCTATTGGGATAAGTTTATTAAAGAAAAGTGGTATTGAG
GTGAGGCTAATCTCAGAAAGGGCCTGTTCAAAGCAGACGCTGTCTTCTTTAAAACTGGATTGCAAAATGGAAGTCAGTGT 
ATCAGACAAGCTAGCAGTTGTAGATGAATGGAGAAAAGAAATGGGCCTGTGCTGGAAAGAAGTGGCATATCTTGGAAATG 
AAGTGTCTGATGAAGAGTGCTTGAAGAGAGTGGGCCTAAGTGGCGCTCCTGCTGATGCCTGTTCCTACGCCCAGAAGGCT 
GTTGGATACATTTGCAAATGTAATGGTGGCCGTGGTGCCATCCGAGAATTTGCAGAGCACATTTGCCTACTAATGGAAAA 
AGTTAATAATTCATGCCAAAAATAG 

FIG.29 



MDSVEKGAATSVSNPRGRPSRGRPPKLQRNSRGGQGRGVEKPPHLAALILARGGSKGIPLKNIKHLAGVPLIGWVLRAAL
DSGAFQSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKDSSTSLDAIIEFLNYXNEXDIVGNIQATSXCLHPTDLQKVAEM
IREEGYDSXFSVVRRHQFRWSEIQKGVREVTEPLNLNPAKRPRRQDWDGELYENGSFYFAKRHLIEMGYLQGGKWHTTKC
ELEHSVDIDVDIDWPIAEQRVLRYGYFGKEKLKEIKLLVCNIDGCLTNGHIYVSGDQKEIISYDVKDAIGISLLKKSGIE
VRLISERACSKQTLSSLKLDCKMEVSVSDKLAVVDEWRKEMGLCWKEVAYLGNEVSDEECLKRVGLSGAPADACSYAQKA
VGYICKCNGGRGAIREFAEHICLLMEKVNNSCQK.

FIG.30

ATGCCGCTGGAGCTGGAGCTGTGTCCCGGGCGCTGGGTGGGCGGGCAACACCCGTGCTTCATCATTGCCGAGATCGGCCA
GAACCACCAGGGCGACCTGGACGTAGCCAAGCGCATGATCCGCATGGCCAAGGAGTGTGGGGCTGATTGTGCCAAGTTCC 
AGAAGAGTGAGCTAGAATTCAAGTTTAATCGGAAAGCCTTGGAGAGGCCATACACCTCGAAGCATTCCTGGGGGAAGACG 
TACGGGGAGCACAAACGACATCTGGAGTTCAGCCATGACCAGTACAGGGAGCTGCAGAGGTACGCCGAGGAGGTTGGGAT 
CTTCTTCACTGCCTCTGGCATGGATGAGATGGCAGTTGAATTCCTGCATGAACTGAATGTTCCATTTTTCAAAGTTGGAT 
CTGGAGACACTAATAATTTTCCTTATCTGGAAAAGACAGCCAAAAAAGGTCGCCCAATGGTGATCTCCAGTGGGATGCAG 
TCAATGGACACCATGAAGCAAGTTTATCAGATCGTGAAGCCCCTCAACCCCAACTTCTGCTTCTTGCAGTGTACCAGCGC 
ATACCCGCTCCAGCCTGAGGACGTCAACCTGCGGGTCATCTCGGAATATCAGAAGCTCTTTCCTGACATTCCCATAGGGT 
ATTCTGGGCATGAAACAGGCATAGCGATATCTGTGGCCGCAGTGGCTCTGGGGGCCAAGGTGTTGGAACGTCACATAACT 
TTGGACAAGACCTGGAAGGGGAGTGACCACTCGGCCTCGCTGGAGCCTGGAGAACTGGCCGAGCTGGTGCGGTCAGTGCG 
TCTTGTGGAGCGTGCCCTGGGCTCCCCAACCAAGCAGCTGCTGCCCTGTGAGATGGCCTGCAATGAGAAGCTGGGCAAGT 
CTGTGGTGGCCAAAGTGAAAATTCCGGAAGGCACCATTCTAACAATGGACATGCTCACCGTGAAGGTGGGTGAGCCCAAA
GCCTATCCTCCTGAAGACATCTTTAATCTAGTGGGCAAGAAGGTCCTGGTCACTGTTGAAGAGGATGACACCATCATGGA 
AGAATTGGTAGATAATCATGGCAAAAAAATCAAGTCTTAA 

FIG.31 



MPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVAKRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT
YGEHKRHLEFSHDQYRELQRYAEEVGIFFTASGMDEMAVEFLHELNVPFFKVGSGDTNNFPYLEKTAKKGRPMVISSGMQ
SMDTMKQVYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIPIGYSGHETGIAISVAAVALGAKVLERHIT
LDKTWKGSDHSASLEPGELAELVRSVRLVERALGSPTKQLLPCEMACNEKLGKSVVAKVKIPEGTILTMDMLTVKVGEPK
AYPPEDIFNLVGKKVLVTVEEDDTIMEELVDNHGKKIKS

10 20 30 40 50 60

1 CGG ACC CAG ACT GGT AGT GCA GGC TTT GGA CCC CGA GCC GCT GCA ATG CCG CTG GAG CTG 60 
1 M P L E L 5 

70 80 90 100 110 120 

61 GAG CTG TGT CCC GGG CGC TGG GTG GGC GGG CAA CAC CCG TGC TTC ATC ATT GCC GAG ATC 120 
6 E L C P G R W V G G Q H P C F I I A E I 25

130 140 150 160 170 180 

121 GGC CAG AAC CAC CAG GGC GAC CTG GAC GTA GCC AAG CGC ATG ATC CGC ATG GCC AAG GAG 180 
26 G Q N H Q G D L D V A K R M I R M A K E 45

190 200 210 220 230 240 

181 TGT GGG GCT GAT TGT GCC AAG TTC CAG AAG AGT GAG CTA GAA TTC AAG TTT AAT CGG AAA 240 
46 C G A D C A K F Q K S E L E F K F N R K 65 

250 260 270 280 290 300 

241 GCC TTG GAG AGG CCA TAC ACC TCG AAG CAT TCC TGG GGG AAG ACG TAC GGG GAG CAC AAA 300 
66 A L E R P Y T S K H S W G K T Y G E H K 85

310 320 330 340 350 360

301 CGA CAT CTG GAG TTC AGC CAT GAC CAG TAC AGG GAG CTG CAG AGG TAC GCC GAG GAG GTT 360
86 R H L E F S H D Q Y R E L Q R Y A E E V 105

370 380 390 400 410 420 

361 GGG ATC TTC TTC ACT GCC TCT GGC ATG GAT GAG ATG GCA GTT GAA TTC CTG CAT GAA CTG 420 
106 G I F F T A S G M D E M A V E F L H E L 125

430 440 450 460 470 480 

421 AAT GTT CCA TTT TTC AAA GTT GGA TCT GGA GAC ACT AAT AAT TTT CCT TAT CTG GAA AAG 480 
126 N V P F F K V G S G D T N N F P Y L E K 145 

490 500 510 520 530 540 

481 ACA GCC AAA AAA GGT CGC CCA ATG GTG ATC TCC AGT GGG ATG CAG TCA ATG GAC ACC ATG 540 
146 T A K K G R P M V I S S G M Q S M D T M 165

550 560 570 580 590 600

541 AAG CAA GTT TAT CAG ATC GTG AAG CCC CTC AAC CCC AAC TTC TGC TTC TTG CAG TGT ACC 600 
166 K Q V Y Q I V K P L N P N F C F L Q C T 185

610 620 630 640 650 660 

601 AGC GCA TAC CCG CTC CAG CCT GAG GAC GTC AAC CTG CGG GTC ATC TCG GAA TAT CAG AAG 660 
186 S A Y P L Q P E D V N L R V I S E Y Q K 205 

670 680 690 700 710 720 

661 CTC TTT CCT GAC ATT CCC ATA GGG TAT TCT GGG CAT GAA ACA GGC ATA GCG ATA TCT GTG 720 
206 L F P D I P I G Y S G H E T G I A I S V 225 

730 740 750 760 770 780 

721 GCC GCA GTG GCT CTG GGG GCC AAG GTG TTG GAA CGT CAC ATA ACT TTG GAC AAG ACC TGG 780 
226 A A V A L G A K V L E R H I T L D K T W 245 

790 800 810 820 830 840 

781 AAG GGG AGT GAC CAC TCG GCC TCG CTG GAG CCT GGA GAA CTG GCC GAG CTG GTG CGG TCA 840 
246 K G S D H S A S L E P G E L A E L V R S 265 

850 860 870 880 890 900 

841 GTG CGT CTT GTG GAG CGT GCC CTG GGC TCC CCA ACC AAG CAG CTG CTG CCC TGT GAG ATG 900 
266 V R L V E R A L G S P T K Q L L P C E M 285 

910 920 930 940 950 960 

901 GCC TGC AAT GAG AAG CTG GGC AAG TCT GTG GTG GCC AAA GTG AAA ATT CCG GAA GGC ACC 960 
286 A C N E K L G K S V V A K V K I P E G T 305

970 980 990 1000 1010 1020 

961 ATT CTA ACA ATG GAC ATG CTC ACC GTG AAG GTG GGT GAG CCC AAA GCC TAT CCT CCT GAA 1020 
306 I L T M D M L T V K V G E P K A Y P P E 325

1030 1040 1050 1060 1070 1080 

1021 GAC ATC TTT AAT CTA GTG GGC AAG AAG GTC CTG GTC ACT GTT GAA GAG GAT GAC ACC ATC 1080 
326 D I F N L V G K K V L V T V E E D D T I 345

1090 1100 1110 1120 1130 1140

1081 ATG GAA GAA TTG GTA GAT AAT CAT GGC AAA AAA ATC AAG TCT TAA AAA TAA AGT GCC ATT 1140
346 M E E L V D N H G K K I K S * 359



1141 CTC TGA 1146

  1 MPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVAKRMIRMAKECGADCAKFQKSELEF
    |                   | |||| || |  | |  ||  ||| |    |||
  1 MS               NIYIVAEIGCNHNGSVDIAREMILKAKEAGVNAVKFQTFKADK

 61 KFNRKALERPYTSKHSWG-KTYGEHKRHLEFSHDQYRELQRYAEEVGIFFTASGMDEMAV
         |    |  |         |    ||   | |  |  ||            ||
 46 LISAIAPKAEYQIKNTGELESQLEMTKKLEMKYDDYLHLMEYAVSLNLDVFSTPFDEDSI

120 EFLHELNVPFFKVGSGDTNNFPYLEKTAK---KGRPMVISSGMQSMDTMKQ---VYQIVK
     ||  |     |  ||   | ||||| ||         || ||   |  ||        |
106 DFLASLKQKIWKIPSGELLNLPYLEKIAKLPIPDKKIIISTGMATIDEIKQSVSIFINNK

174 PLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIPIGYSGHETGIAISVAAVALGAK
        |   | |   ||   |||||  |    | ||   || | |  |     |||  |
166 VPVGNITILHCNTEYPTPFEDVNLNAINDLKKHFPKNNIGFSDHSSGFYAAIAAVPYGIT

234 VLERHITLDKTWKGSDHSASLEPGELAELVRSVRLVERALGSPTKQLLPCEMACNEKLGK
      | | ||||   | || || || ||  |   || ||  |||  |     |        |
226 FIEKHFTLDKSMSGPDHLASIEPDELKHLCIGVRCVEKSLGSNSKVVTASERKNKIVARK

294 SVVAKVKIPEGTILTMDMLTVKVGEPKAYPPEDIFNLVGKKVLVTVEEDDTIMEELVDNH
    |  ||  |  |        | |        |    || ||       | | |  ||
286 SIIAKTEIKKGEVFSEKNITTKRP-GNGISPMEWYNLLGK     IAEQDFIPDELIIHS

354 G-KKIKS
      |
340 EFKNQGE

SEQUENCE LISTING
<110> Human Genome Sciences, Inc. 

<120> Engineering Intracellular Sialylation Pathways 

<130> PF509.PCT

<140> Unassigned 
<141> 2000-03-01 

<150> 60/122,582 
<151> 1999-12-07 

<150> 60/169,624 
<151> 1999-12-08 

<160> 8 

<170> PatentIn Ver. 2.1

<210> 1 

<211> 1429 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(693)

<400> 1

atg gcc ttc cca aag aag aaa ctt cag ggt ctt gtg gct gca acc atc 48
Met Ala Phe Pro Lys Lys Lys Leu Gln Gly Leu Val Ala Ala Thr Ile
1 5 10 15

acg cca atg act gag aat gga gaa atc aac ttt tca gta att ggt cag 96
Thr Pro Met Thr Glu Asn Gly Glu Ile Asn Phe Ser Val Ile Gly Gln
20 25 30

tat gtg gat tat ctt gtg aaa gaa cag gga gtg aag aac att ttt gtg 144
Tyr Val Asp Tyr Leu Val Lys Glu Gln Gly Val Lys Asn Ile Phe Val
35 40 45

aat ggc aca aca gga gaa ggc ctg tcc ctg agc gtc tca gag cgt cgc 192
Asn Gly Thr Thr Gly Glu Gly Leu Ser Leu Ser Val Ser Glu Arg Arg
50 55 60

cag gtt gca gag gag tgg gtg aca aaa ggg aag gac aag ctg gat cag 240
Gln Val Ala Glu Glu Trp Val Thr Lys Gly Lys Asp Lys Leu Asp Gln
65 70 75 80

gtg ata att cac gta gga gca ctg agc ttg aag gag tca cag gaa ctg 288
Val Ile Ile His Val Gly Ala Leu Ser Leu Lys Glu Ser Gln Glu Leu
85 90 95

gcc caa cat gca gca gaa ata gga gct gat ggc atc gct gtc att gca 336
Ala Gln His Ala Ala Glu Ile Gly Ala Asp Gly Ile Ala Val Ile Ala
100 105 110

ccg ttc ttc ctc aag cca tgg acc aaa gat atc ctg att aat ttc cta 384
Pro Phe Phe Leu Lys Pro Trp Thr Lys Asp Ile Leu Ile Asn Phe Leu
115 120 125

aag gaa gtg gct gct gcc gcc cct gcc ctg cca ttt tat tac tat cac 432
Lys Glu Val Ala Ala Ala Ala Pro Ala Leu Pro Phe Tyr Tyr Tyr His
130 135 140

att cct gcc ttg aca ggg gta aag att cgt gct gag gag ttg ttg gat 480
Ile Pro Ala Leu Thr Gly Val Lys Ile Arg Ala Glu Glu Leu Leu Asp
145 150 155 160

ggg att ctg gat aag atc ccc acc ttc caa ggg ctg aaa ttc agt gat 528
Gly Ile Leu Asp Lys Ile Pro Thr Phe Gln Gly Leu Lys Phe Ser Asp
165 170 175

aca gat ctc tta gac ttc ggg caa tgt gtt gat cag aat cgc cag caa 576
Thr Asp Leu Leu Asp Phe Gly Gln Cys Val Asp Gln Asn Arg Gln Gln
180 185 190

cag ttt gct ttc ctt ttt ggg gtg gat gag caa ctg ttg agt gct ctg 624
Gln Phe Ala Phe Leu Phe Gly Val Asp Glu Gln Leu Leu Ser Ala Leu
195 200 205

gtg atg gga gca act gga gca gtg ggc agt ttt gta tcc aga gat tta 672
Val Met Gly Ala Thr Gly Ala Val Gly Ser Phe Val Ser Arg Asp Leu
210 215 220

tca act ttg ttg tca aac tag gttttggagt gtcacagacc aaagccatca 723
Ser Thr Leu Leu Ser Asn
225 230

tgactctggt ctctgggatt ccaatgggcc caccccggct tccactgcag aaagcctcca 783
gggagtttac tgatagtgct gaagctaaac tgaagagcct ggatttcctt tctttcactg 843
atttaaagga tggaaacttg gaagctggta gctagtgcct ctctatcaaa tcagggtttg 903
caccttgaga cataatctac cttaaatagt gcattttttt ctcagggaat tttagatgaa 963
cttgaataaa ctctcctagc aaatgaaatc tcacaataag cattgaggta ccttttgtga 1023
gccttaaaaa gtcttatttt gtgaaggggc aaaaactcta ggagtcacaa ctctcagtca 1083
ttcatttcac agattttttt gtggagaaat ttctgtttat atggatgaaa tggaatcaag 1143
aggaaaattg taattgatta attccatctg tctttaggag ctctcattat ctcggtctct 1203
ggttcctaat cctattttaa agttgtctaa ttttaaacca ctataatatg tcttcatttt 1263
aataaatatt catttggaat ctaggaaaac tctgagctac tgcatttagg caggcacttt 1323
aataccaaac tgtaacatgt ctcaactgta tacaactcaa aatacaccag ctcatttggc 1383
tgctcagtct aactctagaa tggatgcttt tgaattcatt tcgatg 1429



<210> 2
<211> 230
<212> PRT
<213> Homo sapiens

<400> 2

Met Ala Phe Pro Lys Lys Lys Leu Gln Gly Leu Val Ala Ala Thr Ile
1 5 10 15

Thr Pro Met Thr Glu Asn Gly Glu Ile Asn Phe Ser Val Ile Gly Gln
20 25 30

Tyr Val Asp Tyr Leu Val Lys Glu Gln Gly Val Lys Asn Ile Phe Val
35 40 45

Asn Gly Thr Thr Gly Glu Gly Leu Ser Leu Ser Val Ser Glu Arg Arg
50 55 60

Gln Val Ala Glu Glu Trp Val Thr Lys Gly Lys Asp Lys Leu Asp Gln
65 70 75 80

Val Ile Ile His Val Gly Ala Leu Ser Leu Lys Glu Ser Gln Glu Leu
85 90 95

Ala Gln His Ala Ala Glu Ile Gly Ala Asp Gly Ile Ala Val Ile Ala
100 105 110

Pro Phe Phe Leu Lys Pro Trp Thr Lys Asp Ile Leu Ile Asn Phe Leu
115 120 125

Lys Glu Val Ala Ala Ala Ala Pro Ala Leu Pro Phe Tyr Tyr Tyr His
130 135 140

Ile Pro Ala Leu Thr Gly Val Lys Ile Arg Ala Glu Glu Leu Leu Asp
145 150 155 160

Gly Ile Leu Asp Lys Ile Pro Thr Phe Gln Gly Leu Lys Phe Ser Asp
165 170 175

Thr Asp Leu Leu Asp Phe Gly Gln Cys Val Asp Gln Asn Arg Gln Gln
180 185 190

Gln Phe Ala Phe Leu Phe Gly Val Asp Glu Gln Leu Leu Ser Ala Leu
195 200 205

Val Met Gly Ala Thr Gly Ala Val Gly Ser Phe Val Ser Arg Asp Leu
210 215 220

Ser Thr Leu Leu Ser Asn
225 230

<210> 3 

<211> 1305 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(1305)

<400> 3

atg gac tcg gtg gag aag ggg gcc gcc acc tcc gtc tcc aac ccg cgg 48
Met Asp Ser Val Glu Lys Gly Ala Ala Thr Ser Val Ser Asn Pro Arg
1 5 10 15

ggg cga ccg tcc cgg ggc cgg ccg ccg aag ctg cag cgc aac tct cgc 96
Gly Arg Pro Ser Arg Gly Arg Pro Pro Lys Leu Gln Arg Asn Ser Arg
20 25 30

ggc ggc cag ggc cga ggt gtg gag aag ccc ccg cac ctg gca gcc cta 144
Gly Gly Gln Gly Arg Gly Val Glu Lys Pro Pro His Leu Ala Ala Leu
35 40 45

att ctg gcc cgg gga ggc agc aaa ggc atc ccc ctg aag aac att aag 192
Ile Leu Ala Arg Gly Gly Ser Lys Gly Ile Pro Leu Lys Asn Ile Lys
50 55 60

cac ctg gcg ggg gtc ccg ctc att ggc tgg gtc ctg cgt gcg gcc ctg 240
His Leu Ala Gly Val Pro Leu Ile Gly Trp Val Leu Arg Ala Ala Leu
65 70 75 80

gat tca ggg gcc ttc cag agt gta tgg gtt tcg aca gac cat gat gaa 288
Asp Ser Gly Ala Phe Gln Ser Val Trp Val Ser Thr Asp His Asp Glu
85 90 95

att gag aat gtg gcc aaa caa ttt ggt gca caa gtt cat cga aga agt 336
Ile Glu Asn Val Ala Lys Gln Phe Gly Ala Gln Val His Arg Arg Ser
100 105 110

tct gaa gtt tca aaa gac agc tct acc tca cta gat gcc atc ata gaa 384
Ser Glu Val Ser Lys Asp Ser Ser Thr Ser Leu Asp Ala Ile Ile Glu
115 120 125

ttt ctt aat tat yat aat gag gkt gac att gta gga aat att caa gct 432
Phe Leu Asn Tyr Xaa Asn Glu Xaa Asp Ile Val Gly Asn Ile Gln Ala
130 135 140

act tct yca tgt tta cat cct act gat ctt caa aaa gtt gca gaa atg 480
Thr Ser Xaa Cys Leu His Pro Thr Asp Leu Gln Lys Val Ala Glu Met
145 150 155 160

att cga gaa gaa gga tat gat tct gkt ttc tct gtt gtg aga cgc cat 528
Ile Arg Glu Glu Gly Tyr Asp Ser Xaa Phe Ser Val Val Arg Arg His
165 170 175

cag ttt cga tgg agt gaa att cag aaa gga gtt cgt gaa gtg acc gaa 576
Gln Phe Arg Trp Ser Glu Ile Gln Lys Gly Val Arg Glu Val Thr Glu
180 185 190

cct ctg aat tta aat cca gct aaa cgg cct cgt cga caa gac tgg gat 624
Pro Leu Asn Leu Asn Pro Ala Lys Arg Pro Arg Arg Gln Asp Trp Asp
195 200 205

gga gaa tta tat gaa aat ggc tca ttt tat ttt gct aaa aga cat ttg 672
Gly Glu Leu Tyr Glu Asn Gly Ser Phe Tyr Phe Ala Lys Arg His Leu
210 215 220

ata gag atg ggt tac ttg cag ggt gga aaa tgg cat act acg aaa tgc 720
Ile Glu Met Gly Tyr Leu Gln Gly Gly Lys Trp His Thr Thr Lys Cys
225 230 235 240

gag ctg gaa cat agt gtg gat ata gat gtg gat att gat tgg cct att 768
Glu Leu Glu His Ser Val Asp Ile Asp Val Asp Ile Asp Trp Pro Ile
245 250 255

gca gag caa aga gta tta aga tat ggc tat ttt ggc aaa gag aag ctt 816
Ala Glu Gln Arg Val Leu Arg Tyr Gly Tyr Phe Gly Lys Glu Lys Leu
260 265 270

aag gaa ata aaa ctt ttg gtt tgc aat att gat gga tgt ctc acc aat 864
Lys Glu Ile Lys Leu Leu Val Cys Asn Ile Asp Gly Cys Leu Thr Asn
275 280 285

ggc cac att tat gta tca gga gac caa aaa gaa ata ata tct tat gat 912
Gly His Ile Tyr Val Ser Gly Asp Gln Lys Glu Ile Ile Ser Tyr Asp
290 295 300

gta aaa gat gct att ggg ata agt tta tta aag aaa agt ggt att gag 960
Val Lys Asp Ala Ile Gly Ile Ser Leu Leu Lys Lys Ser Gly Ile Glu
305 310 315 320

gtg agg cta atc tca gaa agg gcc tgt tca aag cag acg ctg tct tct 1008
Val Arg Leu Ile Ser Glu Arg Ala Cys Ser Lys Gln Thr Leu Ser Ser
325 330 335

tta aaa ctg gat tgc aaa atg gaa gtc agt gta tca gac aag cta gca 1056
Leu Lys Leu Asp Cys Lys Met Glu Val Ser Val Ser Asp Lys Leu Ala
340 345 350

gtt gta gat gaa tgg aga aaa gaa atg ggc ctg tgc tgg aaa gaa gtg 1104
Val Val Asp Glu Trp Arg Lys Glu Met Gly Leu Cys Trp Lys Glu Val
355 360 365

gca tat ctt gga aat gaa gtg tct gat gaa gag tgc ttg aag aga gtg 1152
Ala Tyr Leu Gly Asn Glu Val Ser Asp Glu Glu Cys Leu Lys Arg Val
370 375 380

ggc cta agt ggc gct cct gct gat gcc tgt tcc tac gcc cag aag gct 1200
Gly Leu Ser Gly Ala Pro Ala Asp Ala Cys Ser Tyr Ala Gln Lys Ala
385 390 395 400

gtt gga tac att tgc aaa tgt aat ggt ggc cgt ggt gcc atc cga gaa 1248
Val Gly Tyr Ile Cys Lys Cys Asn Gly Gly Arg Gly Ala Ile Arg Glu
405 410 415

ttt gca gag cac att tgc cta cta atg gaa aaa gtt aat aat tca tgc 1296
Phe Ala Glu His Ile Cys Leu Leu Met Glu Lys Val Asn Asn Ser Cys
420 425 430

caa aaa tag 1305
Gln Lys
435



<210> 4 

<211> 434 

<212> PRT 

<213> Homo sapiens 

<400> 4 

Met Asp Ser Val Glu Lys Gly Ala Ala Thr Ser Val Ser Asn Pro Arg
1 5 10 15

Gly Arg Pro Ser Arg Gly Arg Pro Pro Lys Leu Gln Arg Asn Ser Arg
20 25 30

Gly Gly Gln Gly Arg Gly Val Glu Lys Pro Pro His Leu Ala Ala Leu
35 40 45

Ile Leu Ala Arg Gly Gly Ser Lys Gly Ile Pro Leu Lys Asn Ile Lys
50 55 60

His Leu Ala Gly Val Pro Leu Ile Gly Trp Val Leu Arg Ala Ala Leu
65 70 75 80

Asp Ser Gly Ala Phe Gln Ser Val Trp Val Ser Thr Asp His Asp Glu
85 90 95

Ile Glu Asn Val Ala Lys Gln Phe Gly Ala Gln Val His Arg Arg Ser
100 105 110

Ser Glu Val Ser Lys Asp Ser Ser Thr Ser Leu Asp Ala Ile Ile Glu
115 120 125

Phe Leu Asn Tyr Xaa Asn Glu Xaa Asp Ile Val Gly Asn Ile Gln Ala
130 135 140

Thr Ser Xaa Cys Leu His Pro Thr Asp Leu Gln Lys Val Ala Glu Met
145 150 155 160

Ile Arg Glu Glu Gly Tyr Asp Ser Xaa Phe Ser Val Val Arg Arg His
165 170 175

Gln Phe Arg Trp Ser Glu Ile Gln Lys Gly Val Arg Glu Val Thr Glu
180 185 190

Pro Leu Asn Leu Asn Pro Ala Lys Arg Pro Arg Arg Gln Asp Trp Asp
195 200 205

Gly Glu Leu Tyr Glu Asn Gly Ser Phe Tyr Phe Ala Lys Arg His Leu
210 215 220

Ile Glu Met Gly Tyr Leu Gln Gly Gly Lys Trp His Thr Thr Lys Cys
225 230 235 240

Glu Leu Glu His Ser Val Asp Ile Asp Val Asp Ile Asp Trp Pro Ile
245 250 255

Ala Glu Gln Arg Val Leu Arg Tyr Gly Tyr Phe Gly Lys Glu Lys Leu
260 265 270

Lys Glu Ile Lys Leu Leu Val Cys Asn Ile Asp Gly Cys Leu Thr Asn
275 280 285

Gly His Ile Tyr Val Ser Gly Asp Gln Lys Glu Ile Ile Ser Tyr Asp
290 295 300

Val Lys Asp Ala Ile Gly Ile Ser Leu Leu Lys Lys Ser Gly Ile Glu
305 310 315 320

Val Arg Leu Ile Ser Glu Arg Ala Cys Ser Lys Gln Thr Leu Ser Ser
325 330 335

Leu Lys Leu Asp Cys Lys Met Glu Val Ser Val Ser Asp Lys Leu Ala
340 345 350

Val Val Asp Glu Trp Arg Lys Glu Met Gly Leu Cys Trp Lys Glu Val
355 360 365

Ala Tyr Leu Gly Asn Glu Val Ser Asp Glu Glu Cys Leu Lys Arg Val
370 375 380

Gly Leu Ser Gly Ala Pro Ala Asp Ala Cys Ser Tyr Ala Gln Lys Ala
385 390 395 400

Val Gly Tyr Ile Cys Lys Cys Asn Gly Gly Arg Gly Ala Ile Arg Glu
405 410 415

Phe Ala Glu His Ile Cys Leu Leu Met Glu Lys Val Asn Asn Ser Cys
420 425 430

Gln Lys


<210> 5 

<211> 1080 

<212> DNA

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(1080)

<400> 5

atg ccg ctg gag ctg gag ctg tgt ccc ggg cgc tgg gtg ggc ggg caa 48
Met Pro Leu Glu Leu Glu Leu Cys Pro Gly Arg Trp Val Gly Gly Gln
1 5 10 15

cac ccg tgc ttc atc att gcc gag atc ggc cag aac cac cag ggc gac 96
His Pro Cys Phe Ile Ile Ala Glu Ile Gly Gln Asn His Gln Gly Asp
20 25 30

ctg gac gta gcc aag cgc atg atc cgc atg gcc aag gag tgt ggg gct 144
Leu Asp Val Ala Lys Arg Met Ile Arg Met Ala Lys Glu Cys Gly Ala
35 40 45

gat tgt gcc aag ttc cag aag agt gag cta gaa ttc aag ttt aat cgg 192
Asp Cys Ala Lys Phe Gln Lys Ser Glu Leu Glu Phe Lys Phe Asn Arg
50 55 60

aaa gcc ttg gag agg cca tac acc tcg aag cat tcc tgg ggg aag acg 240
Lys Ala Leu Glu Arg Pro Tyr Thr Ser Lys His Ser Trp Gly Lys Thr
65 70 75 80

tac ggg gag cac aaa cga cat ctg gag ttc agc cat gac cag tac agg 288
Tyr Gly Glu His Lys Arg His Leu Glu Phe Ser His Asp Gln Tyr Arg
85 90 95

gag ctg cag agg tac gcc gag gag gtt ggg atc ttc ttc act gcc tct 336
Glu Leu Gln Arg Tyr Ala Glu Glu Val Gly Ile Phe Phe Thr Ala Ser
100 105 110

ggc atg gat gag atg gca gtt gaa ttc ctg cat gaa ctg aat gtt cca 384
Gly Met Asp Glu Met Ala Val Glu Phe Leu His Glu Leu Asn Val Pro
115 120 125

ttt ttc aaa gtt gga tct gga gac act aat aat ttt cct tat ctg gaa 432
Phe Phe Lys Val Gly Ser Gly Asp Thr Asn Asn Phe Pro Tyr Leu Glu
130 135 140

aag aca gcc aaa aaa ggt cgc cca atg gtg atc tcc agt ggg atg cag 480
Lys Thr Ala Lys Lys Gly Arg Pro Met Val Ile Ser Ser Gly Met Gln
145 150 155 160

tca atg gac acc atg aag caa gtt tat cag atc gtg aag ccc ctc aac 528
Ser Met Asp Thr Met Lys Gln Val Tyr Gln Ile Val Lys Pro Leu Asn
165 170 175

ccc aac ttc tgc ttc ttg cag tgt acc agc gca tac ccg ctc cag cct 576
Pro Asn Phe Cys Phe Leu Gln Cys Thr Ser Ala Tyr Pro Leu Gln Pro
180 185 190

gag gac gtc aac ctg cgg gtc atc tcg gaa tat cag aag ctc ttt cct 624
Glu Asp Val Asn Leu Arg Val Ile Ser Glu Tyr Gln Lys Leu Phe Pro
195 200 205

gac att ccc ata ggg tat tct ggg cat gaa aca ggc ata gcg ata tct 672
Asp Ile Pro Ile Gly Tyr Ser Gly His Glu Thr Gly Ile Ala Ile Ser
210 215 220

gtg gcc gca gtg gct ctg ggg gcc aag gtg ttg gaa cgt cac ata act 720
Val Ala Ala Val Ala Leu Gly Ala Lys Val Leu Glu Arg His Ile Thr
225 230 235 240

ttg gac aag acc tgg aag ggg agt gac cac tcg gcc tcg ctg gag cct 768
Leu Asp Lys Thr Trp Lys Gly Ser Asp His Ser Ala Ser Leu Glu Pro
245 250 255

gga gaa ctg gcc gag ctg gtg cgg tca gtg cgt ctt gtg gag cgt gcc 816
Gly Glu Leu Ala Glu Leu Val Arg Ser Val Arg Leu Val Glu Arg Ala
260 265 270

ctg ggc tcc cca acc aag cag ctg ctg ccc tgt gag atg gcc tgc aat 864
Leu Gly Ser Pro Thr Lys Gln Leu Leu Pro Cys Glu Met Ala Cys Asn
275 280 285

gag aag ctg ggc aag tct gtg gtg gcc aaa gtg aaa att ccg gaa ggc 912
Glu Lys Leu Gly Lys Ser Val Val Ala Lys Val Lys Ile Pro Glu Gly
290 295 300

acc att cta aca atg gac atg ctc acc gtg aag gtg ggt gag ccc aaa 960
Thr Ile Leu Thr Met Asp Met Leu Thr Val Lys Val Gly Glu Pro Lys
305 310 315 320

gcc tat cct cct gaa gac atc ttt aat cta gtg ggc aag aag gtc ctg 1008
Ala Tyr Pro Pro Glu Asp Ile Phe Asn Leu Val Gly Lys Lys Val Leu
325 330 335

gtc act gtt gaa gag gat gac acc atc atg gaa gaa ttg gta gat aat 1056
Val Thr Val Glu Glu Asp Asp Thr Ile Met Glu Glu Leu Val Asp Asn
340 345 350

cat ggc aaa aaa atc aag tct taa 1080
His Gly Lys Lys Ile Lys Ser
355 360



<210> 6 
<211> 359 
<212> PRT 

<213> Homo sapiens 
<400> 6 

Met Pro Leu Glu Leu Glu Leu Cys Pro Gly Arg Trp Val Gly Gly Gln
1 5 10 15

His Pro Cys Phe Ile Ile Ala Glu Ile Gly Gln Asn His Gln Gly Asp
20 25 30

Leu Asp Val Ala Lys Arg Met Ile Arg Met Ala Lys Glu Cys Gly Ala
35 40 45

Asp Cys Ala Lys Phe Gln Lys Ser Glu Leu Glu Phe Lys Phe Asn Arg
50 55 60

Lys Ala Leu Glu Arg Pro Tyr Thr Ser Lys His Ser Trp Gly Lys Thr
65 70 75 80

Tyr Gly Glu His Lys Arg His Leu Glu Phe Ser His Asp Gln Tyr Arg
85 90 95

Glu Leu Gln Arg Tyr Ala Glu Glu Val Gly Ile Phe Phe Thr Ala Ser
100 105 110

Gly Met Asp Glu Met Ala Val Glu Phe Leu His Glu Leu Asn Val Pro
115 120 125

Phe Phe Lys Val Gly Ser Gly Asp Thr Asn Asn Phe Pro Tyr Leu Glu
130 135 140

Lys Thr Ala Lys Lys Gly Arg Pro Met Val Ile Ser Ser Gly Met Gln
145 150 155 160

Ser Met Asp Thr Met Lys Gln Val Tyr Gln Ile Val Lys Pro Leu Asn
165 170 175

Pro Asn Phe Cys Phe Leu Gln Cys Thr Ser Ala Tyr Pro Leu Gln Pro
180 185 190

Glu Asp Val Asn Leu Arg Val Ile Ser Glu Tyr Gln Lys Leu Phe Pro
195 200 205

Asp Ile Pro Ile Gly Tyr Ser Gly His Glu Thr Gly Ile Ala Ile Ser
210 215 220

Val Ala Ala Val Ala Leu Gly Ala Lys Val Leu Glu Arg His Ile Thr
225 230 235 240

Leu Asp Lys Thr Trp Lys Gly Ser Asp His Ser Ala Ser Leu Glu Pro
245 250 255

Gly Glu Leu Ala Glu Leu Val Arg Ser Val Arg Leu Val Glu Arg Ala
260 265 270

Leu Gly Ser Pro Thr Lys Gln Leu Leu Pro Cys Glu Met Ala Cys Asn
275 280 285

Glu Lys Leu Gly Lys Ser Val Val Ala Lys Val Lys Ile Pro Glu Gly
290 295 300

Thr Ile Leu Thr Met Asp Met Leu Thr Val Lys Val Gly Glu Pro Lys
305 310 315 320

Ala Tyr Pro Pro Glu Asp Ile Phe Asn Leu Val Gly Lys Lys Val Leu
325 330 335

Val Thr Val Glu Glu Asp Asp Thr Ile Met Glu Glu Leu Val Asp Asn
340 345 350

His Gly Lys Lys Ile Lys Ser
355


<210> 7 

<211> 1059 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(1041)

<400> 7

atg agt aat ata tat atc gtt gct gaa att ggt tgc aac cat aat ggt 48
Met Ser Asn Ile Tyr Ile Val Ala Glu Ile Gly Cys Asn His Asn Gly
1 5 10 15

agt gtt gat att gca aga gaa atg ata tta aaa gcc aaa gag gcc ggt 96
Ser Val Asp Ile Ala Arg Glu Met Ile Leu Lys Ala Lys Glu Ala Gly
20 25 30

gtt aat gca gta aaa ttc caa aca ttt aaa gct gat aaa tta att tca 144
Val Asn Ala Val Lys Phe Gln Thr Phe Lys Ala Asp Lys Leu Ile Ser
35 40 45

gct att gca cct aag gca gag tat caa ata aaa aac aca gga gaa tta 192
Ala Ile Ala Pro Lys Ala Glu Tyr Gln Ile Lys Asn Thr Gly Glu Leu
50 55 60

gaa tct cag tta gaa atg aca aaa aag ctt gaa atg aag tat gac gat 240
Glu Ser Gln Leu Glu Met Thr Lys Lys Leu Glu Met Lys Tyr Asp Asp
65 70 75 80

tat ctc cat cta atg gaa tat gca gtc agt tta aat tta gat gtt ttt 288
Tyr Leu His Leu Met Glu Tyr Ala Val Ser Leu Asn Leu Asp Val Phe
85 90 95

tct acc cct ttt gac gaa gac tct att gat ttt tta gca tct ttg aaa 336
Ser Thr Pro Phe Asp Glu Asp Ser Ile Asp Phe Leu Ala Ser Leu Lys
100 105 110

caa aaa ata tgg aaa atc cct tca ggt gag tta ttg aat tta ccg tat 384
Gln Lys Ile Trp Lys Ile Pro Ser Gly Glu Leu Leu Asn Leu Pro Tyr
115 120 125

ctt gaa aaa ata gcc aag ctt ccg atc cct gat aag aaa ata atc ata 432
Leu Glu Lys Ile Ala Lys Leu Pro Ile Pro Asp Lys Lys Ile Ile Ile
130 135 140

tca aca gga atg gct act att gat gag ata aaa cag tct gtt tct att 480
Ser Thr Gly Met Ala Thr Ile Asp Glu Ile Lys Gln Ser Val Ser Ile
145 150 155 160

ttt ata aat aat aaa gtt ccg gtt ggt aat att aca ata tta cat tgc 528
Phe Ile Asn Asn Lys Val Pro Val Gly Asn Ile Thr Ile Leu His Cys
165 170 175

aat act gaa tat cca acg ccc ttt gag gat gta aac ctt aat gct att 576
Asn Thr Glu Tyr Pro Thr Pro Phe Glu Asp Val Asn Leu Asn Ala Ile
180 185 190

aat gat ttg aaa aaa cac ttc cct aag aat aac ata ggc ttc tct gat 624
Asn Asp Leu Lys Lys His Phe Pro Lys Asn Asn Ile Gly Phe Ser Asp
195 200 205

cat tct agc ggg ttt tat gca gct att gcg gcg gtg cct tat gga ata 672
His Ser Ser Gly Phe Tyr Ala Ala Ile Ala Ala Val Pro Tyr Gly Ile
210 215 220

act ttt att gaa aaa cat ttc act tta gat aaa tct atg tct ggc cca 720
Thr Phe Ile Glu Lys His Phe Thr Leu Asp Lys Ser Met Ser Gly Pro
225 230 235 240

gat cat ttg gcc tca ata gaa cct gat gaa ctg aaa cat ctt tgt att 768
Asp His Leu Ala Ser Ile Glu Pro Asp Glu Leu Lys His Leu Cys Ile
245 250 255

ggg gtc agg tgt gtt gaa aaa tct tta ggt tca aat agt aaa gtg gtt 816
Gly Val Arg Cys Val Glu Lys Ser Leu Gly Ser Asn Ser Lys Val Val
260 265 270

aca gct tca gaa agg aag aat aaa atc gta gca aga aag tct att ata 864
Thr Ala Ser Glu Arg Lys Asn Lys Ile Val Ala Arg Lys Ser Ile Ile
275 280 285

gct aaa aca gag ata aaa aaa ggt gag gtt ttt tca gaa aaa aat ata 912
Ala Lys Thr Glu Ile Lys Lys Gly Glu Val Phe Ser Glu Lys Asn Ile
290 295 300

aca aca aaa aga cct ggt aat ggt atc agt ccg atg gag tgg tat aat 960
Thr Thr Lys Arg Pro Gly Asn Gly Ile Ser Pro Met Glu Trp Tyr Asn
305 310 315 320

tta ttg ggt aaa att gca gag caa gac ttt att cca gat gaa tta ata 1008
Leu Leu Gly Lys Ile Ala Glu Gln Asp Phe Ile Pro Asp Glu Leu Ile
325 330 335

att cat agc gaa ttc aaa aat cag ggg gaa taa tgagaacaaa aattattg 1059
Ile His Ser Glu Phe Lys Asn Gln Gly Glu
340 345



<210> 8 

<211> 346 

<212> PRT 

<213> Homo sapiens 



<400> 8

Met Ser Asn Ile Tyr Ile Val Ala Glu Ile Gly Cys Asn His Asn Gly
1 5 10 15

Ser Val Asp Ile Ala Arg Glu Met Ile Leu Lys Ala Lys Glu Ala Gly
20 25 30

Val Asn Ala Val Lys Phe Gln Thr Phe Lys Ala Asp Lys Leu Ile Ser
35 40 45

Ala Ile Ala Pro Lys Ala Glu Tyr Gln Ile Lys Asn Thr Gly Glu Leu
50 55 60

Glu Ser Gln Leu Glu Met Thr Lys Lys Leu Glu Met Lys Tyr Asp Asp
65 70 75 80

Tyr Leu His Leu Met Glu Tyr Ala Val Ser Leu Asn Leu Asp Val Phe
85 90 95

Ser Thr Pro Phe Asp Glu Asp Ser Ile Asp Phe Leu Ala Ser Leu Lys
100 105 110

Gln Lys Ile Trp Lys Ile Pro Ser Gly Glu Leu Leu Asn Leu Pro Tyr
115 120 125

Leu Glu Lys Ile Ala Lys Leu Pro Ile Pro Asp Lys Lys Ile Ile Ile
130 135 140

Ser Thr Gly Met Ala Thr Ile Asp Glu Ile Lys Gln Ser Val Ser Ile
145 150 155 160

Phe Ile Asn Asn Lys Val Pro Val Gly Asn Ile Thr Ile Leu His Cys
165 170 175

Asn Thr Glu Tyr Pro Thr Pro Phe Glu Asp Val Asn Leu Asn Ala Ile
180 185 190

Asn Asp Leu Lys Lys His Phe Pro Lys Asn Asn Ile Gly Phe Ser Asp
195 200 205

His Ser Ser Gly Phe Tyr Ala Ala Ile Ala Ala Val Pro Tyr Gly Ile
210 215 220

Thr Phe Ile Glu Lys His Phe Thr Leu Asp Lys Ser Met Ser Gly Pro
225 230 235 240

Asp His Leu Ala Ser Ile Glu Pro Asp Glu Leu Lys His Leu Cys Ile
245 250 255

Gly Val Arg Cys Val Glu Lys Ser Leu Gly Ser Asn Ser Lys Val Val
260 265 270

Thr Ala Ser Glu Arg Lys Asn Lys Ile Val Ala Arg Lys Ser Ile Ile
275 280 285

Ala Lys Thr Glu Ile Lys Lys Gly Glu Val Phe Ser Glu Lys Asn Ile
290 295 300

Thr Thr Lys Arg Pro Gly Asn Gly Ile Ser Pro Met Glu Trp Tyr Asn
305 310 315 320

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM
(PCT Rule 13bis)



A. The indications made below relate to the microorganismreferred to in the description 
on page 33 ,line 21 



B. BDENTIFICATIONOFDEPOSrr Furtherdepositsareidentifiedonanadditionalsheet [ [ 



Nameofdepositaryinstitution American Type Culture Collection 



Address of depositary institution (including postal code and country) 

10801 University Boulevard 
Manassas, Virginia 20110-2209 
United States of America 



Date of deposit 




AccessionNumber 






24 February 2000 




PTA-1 410 



C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ | 



D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are notfor all designated States) 
Europe 

In respect to those designations in which a European Patent is sought a sample of the deposited 
microorganism will be made available until the publication of the mention of the grant of the European patent 
or until the date on which application has been refused or withdrawn or is deemed to be withdrawn, only by 
the issue of such a sample to an expert nominated by the person requesting the sample (Rule 28 (4) EPC). 



E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 



The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit") 



For receiving Office use only

[ ] This sheet was received with the international application



Authorized officer 



For International Bureau use only 



This sheet was received by the International Bureau on:



.0 h MAY 2000 



Authorized officer 



Form PCT/RO/134 (July 1992) 
ATCC Deposit No.: unassigned
CANADA 

The applicant requests that, until either a Canadian patent has been issued on the basis of an
application or the application has been refused, or is abandoned and no longer subject to
reinstatement, or is withdrawn, the Commissioner of Patents only authorizes the furnishing of
a sample of the deposited biological material referred to in the application to an independent
expert nominated by the Commissioner. The applicant must, by a written statement, inform
the International Bureau accordingly before completion of technical preparations for
publication of the international application.

NORWAY 

The applicant hereby requests that, until the application has been laid open to public inspection (by
the Norwegian Patent Office), or has been finally decided upon by the Norwegian Patent
Office without having been laid open to public inspection, the furnishing of a sample shall only be
effected to an expert in the art. The request to this effect shall be filed by the applicant with
the Norwegian Patent Office not later than at the time when the application is made available 
to the public under Sections 22 and 33(3) of the Norwegian Patents Act. If such a request has 
been filed by the applicant, any request made by a third party for the furnishing of a sample 
shall indicate the expert to be used. That expert may be any person entered on the list of 
recognized experts drawn up by the Norwegian Patent Office or any person approved by the 
applicant in the individual case. 

AUSTRALIA 

The applicant hereby gives notice that the furnishing of a sample of a microorganism shall 
only be effected prior to the grant of a patent, or prior to the lapsing, refusal or withdrawal of 
the application, to a person who is a skilled addressee without an interest in the invention 
(Regulation 3.25(3) of the Australian Patents Regulations).

FINLAND 

The applicant hereby requests that, until the application has been laid open to public 
inspection (by the National Board of Patents and Registration), or has been finally decided
upon by the National Board of Patents and Registration without having been laid open to 
public inspection, the furnishing of a sample shall only be effected to an expert in the art. 

UNITED KINGDOM 

The applicant hereby requests that the furnishing of a sample of a microorganism shall only 
be made available to an expert. The request to this effect must be filed by the applicant with 
the International Bureau before the completion of the technical preparations for the 
international publication of the application. 

DENMARK 

The applicant hereby requests that, until the application has been laid open to public 
inspection (by the Danish Patent Office), or has been finally decided upon by the Danish 
Patent Office without having been laid open to public inspection, the furnishing of a sample
shall only be effected to an expert in the art. The request to this effect shall be filed by the
applicant with the Danish Patent Office not later than at the time when the application is made
available to the public under Sections 22 and 33(3) of the Danish Patents Act. If such a
request has been filed by the applicant, any request made by a third party for the furnishing of
a sample shall indicate the expert to be used. That expert may be any person entered on a list
of recognized experts drawn up by the Danish Patent Office or any person approved by the applicant in
the individual case.

SWEDEN 

The applicant hereby requests that, until the application has been laid open to public 
inspection (by the Swedish Patent Office), or has been finally decided upon by the Swedish 
Patent Office without having been laid open to public inspection, the furnishing of a sample 
shall only be effected to an expert in the art. The request to this effect shall be filed by the 
applicant with the International Bureau before the expiration of 16 months from the priority 
date (preferably on the Form PCT/RO/134 reproduced in annex Z of Volume I of the PCT 
Applicant's Guide). If such a request has been filed by the applicant any request made by a 
third party for the furnishing of a sample shall indicate the expert to be used. That expert may 
be any person entered on a list of recognized experts drawn up by the Swedish Patent Office 
or any person approved by the applicant in the individual case.

NETHERLANDS 

The applicant hereby requests that until the date of a grant of a Netherlands patent or until the 
date on which the application is refused or withdrawn or lapsed, the microorganism shall be 
made available as provided in Rule 31F(1) of the Patent Rules only by the issue of a sample to
an expert. The request to this effect must be furnished by the applicant with the Netherlands 
Industrial Property Office before the date on which the application is made available to the 
public under Section 22C or Section 25 of the Patents Act of the Kingdom of the 
Netherlands, whichever of the two dates occurs earlier. 



